INTRODUCTION


This  Final  Technical  Report  constitutes  a  summary  of  the 
research performed under Grant N00014-90-J-1712 during the
period  of  April  1,  1990  through  September  30,  1992.  First  we 
present  a  list  of  the  personnel  involved  in  the  research  effort.  Then 
in  the  following  section  we  present  a  brief  summary  of  the  research 
results  that  have  been  achieved.  Each  of  these  results  is  well 
documented  in  technical  articles,  and  references  to  these  articles 
are  made  in  the  summary  of  the  research  results.  We  hope  you  find 
these interesting.

A SURVEY OF RESULTS


In  this  final  technical  report  we  will  briefly  comment  upon  our 
research  accomplishments  sponsored  by  Grant  N00014-90-J-1712. 
Much  of  our  work  during  this  period  was  concerned  with  various 
aspects  of  random  fields.  The  principal  subareas  of  research 
activity may be characterized by the following:

Research in distributed estimation which might readily arise in a
typical estimation problem in the context of random fields,

Optimal estimation with respect to a large family of cost
functions,

Decentralized estimation with nontraditional fidelity criteria,

Multidimensional quantization which could arise in the effort to
quantize a random field,

Multidimensional convolution,

The concept of finite memory of a stochastic process or a random
field,

Distribution of the determinant of a random matrix,

Stationary random processes,

Zero-crossing rates for Gaussian processes,

Martingale characteristics of a Wiener process,

Estimation of a random variable based on multidimensional data,

Detection Theory versus Hypothesis Testing,

Importance Sampling, and

Mutual Independence.


A.  Distributed  Estimation 


The  results  achieved  in  the  area  of  distributed  estimation  are 
found  in  Appendix  A  in  the  paper  entitled,  "Some  Aspects  of  Fusion  in 
Estimation Theory," which appeared in the March 1991 issue of
IEEE  Transactions  on  Information  Theory.  In  this  paper  we 
considered  the  problem  of  fusion  in  estimation  theory.  We  presented 
several  examples,  using  common  distributions,  in  which  virtually 
any  method  of  fusion  would  be  useless  in  approximating  the  random 
variable  of  interest.  Further,  we  presented  a  theorem  which  for  a 
very  general  situation  shows  that  fusion  resulting  in  an  almost 
surely  exact  approximation  is  always  possible.  In  particular,  this 
result  addressed  the  situation  in  which  the  data  consisted  of  the 
random  variable  of  interest  corrupted  by  additive  Gaussian  noise  and 
the  random  variable  of  interest  could  be  any  second  order  random 
variable. An example was presented which illustrates the utility of
this result.

Some Aspects of Fusion in Estimation Theory

Eric  B.  Hall,  Alan  E.  Wessel,  and  Gary  L.  Wise 


Abstract: The problem of fusing or combining various estimates to
obtain a single good estimate is investigated. Several examples are
presented in which virtually any method of fusion fails. Finally, a very
general situation is considered and an example is presented in which
almost surely exact fusion is always possible.

Index Terms: Fusion, distributed estimation, conditional expectation.


I.  Introduction 

In this correspondence we consider the problem of fusion in
estimation theory. Our primary concern is directed toward finding
a method of fusing or combining a finite number of estimates
of a fixed second-order random variable $X$ in order to
achieve a single "best" estimate of $X$. Our concern throughout
this paper is directed toward minimizing the mean-square error
(mse). In this context, for an arbitrary probability space
$(\Omega, \mathcal{F}, P)$, we recall [1] that it is necessary to take versions of
conditional expectations which are Borel measurable functions
of the conditioned random variables, and we do so throughout
the correspondence.

As an example of fusion in estimation, if $Y_1$ and $Y_2$ are
random variables, how might $E[X|Y_1]$ and $E[X|Y_2]$ be fused to
obtain a good approximation to $E[X|Y_1, Y_2]$? Notice that the
statistical knowledge required to calculate $E[X|Y_1]$ and $E[X|Y_2]$
is less than that required to calculate $E[X|Y_1, Y_2]$, since $E[X|Y_1]$
and $E[X|Y_2]$ can be obtained from the appropriate bivariate
distributions, whereas $E[X|Y_1, Y_2]$ in general cannot. Also, $Y_1$
and $Y_2$ might not be simultaneously available to the person
desiring $E[X|Y_1, Y_2]$, a situation occurring in the usual context
of distributed estimation in which a central location desires to
construct a good estimate based on the estimates obtained by a
set of spatially distinct field observers. Although other authors
have attempted to address the problem of fusion in estimation
theory (see for example [2]-[5]), many important questions in
this area have not been resolved. In this correspondence we
place the problem of fusion in estimation theory in a rigorous
setting and address and answer several crucial questions. For a
treatment of fusion via linear combinations of best linear
estimates, of fusion via linear combinations of best Borel
measurable estimates, and for other comments on this general problem,
we refer the reader to [6].

II. Some Difficulties and Curiosities

We will now consider the general problem of fusion in which
a Borel measurable transformation of best Borel measurable
estimates is sought. In particular, if $X$ is a second-order random
variable, if $n$ is a positive integer, and if $Y_1, Y_2, \ldots, Y_n$ are $n$
random variables, how may $E[X|Y_1]$, $E[X|Y_2]$, $\ldots$, and $E[X|Y_n]$
be fused to approximate $E[X|Y_1, Y_2, \ldots, Y_n]$? In other words,
under what conditions is

$$E\big[X \,\big|\, E[X|Y_1], E[X|Y_2], \ldots, E[X|Y_n]\big]$$

(the best $L_2$ estimate of $X$ based on a Borel measurable
function of $E[X|Y_1]$, $E[X|Y_2]$, $\ldots$, and $E[X|Y_n]$) a good
approximation of $E[X|Y_1, Y_2, \ldots, Y_n]$? As the following examples
indicate, there are numerous subtleties that arise in this context.
For example, as will be seen, even if

$$E[X|Y_1] = E[X|Y_2] = \cdots = E[X|Y_n] \quad \text{a.s.},$$

$E[X|Y_1, Y_2, \ldots, Y_n]$ could be wildly different from $E[X|Y_1]$.

Example 1: Let $\Omega = [0, 1]$, $\mathcal{F}$ denote the Borel subsets of $\Omega$,
and $P$ denote Lebesgue measure on $\mathcal{F}$. Let $A$ be a positive real
number, $\sigma(Y_1) = \sigma([0, 1/2))$, $\sigma(Y_2) = \sigma([1/4, 3/4))$, and
$X(\omega) = A$ for $\omega \in [0, 1/4) \cup [1/2, 3/4)$ and $X(\omega) = -A$ for
$\omega \in [1/4, 1/2) \cup [3/4, 1]$. Then it straightforwardly follows that
$E[X|Y_1] = E[X|Y_2] = 0$ a.s., but $E[X|Y_1, Y_2] = X$ a.s. Notice
that in this special case, any linear combination of $E[X|Y_1]$ and
$E[X|Y_2]$ yields an estimate equal to 0 a.s., resulting in a mse in
approximating $X$ of $A^2$, which can exceed any preassigned
real number. Recalling that $E[X|Y_1]$ and $E[X|Y_2]$, respectively,
are $\sigma(Y_1)$-measurable and $\sigma(Y_2)$-measurable, we see that
$E[X|Y_1] = E[X|Y_2] = 0$ pointwise in $\omega$; similarly, we see that
$E[X|Y_1, Y_2] = X$ pointwise in $\omega$. Thus, in this situation, it is
fruitless to attempt to approximate $X$ based on any function of
$E[X|Y_1]$ and $E[X|Y_2]$.

Note that in Example 1, $Y_1$ and $Y_2$ are independent, $X$ and
$Y_1$ are independent, and $X$ and $Y_2$ are independent. This might
have led an unwary investigator to perhaps assert that
$E[X|Y_1, Y_2] = E[X]$ a.s., or perhaps that $E[X|Y_1, Y_2] = E[X|Y_i]$
a.s. for $i = 1$ or 2. Each of these assertions, which happen to be
equivalent in the setting of Example 1, is incorrect.
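
The computation in Example 1 can be checked mechanically. The following is a minimal sketch, assuming the sample space is collapsed to the four equally likely quarter-length intervals on which $X$ is constant; the helper `cond_exp` and the particular value chosen for `A` are illustrative and not part of the original correspondence.

```python
from fractions import Fraction

# Four equally likely atoms stand in for the four quarter-length intervals.
A = Fraction(5)                      # any positive value of A works
prob = [Fraction(1, 4)] * 4
X = [A, -A, A, -A]                   # value of X on each atom
Y1 = [1, 1, 0, 0]                    # indicator of [0, 1/2)
Y2 = [0, 1, 1, 0]                    # indicator of [1/4, 3/4)

def cond_exp(values, labels):
    """Conditional expectation of `values` given the partition induced by `labels`."""
    out = []
    for lab in labels:
        idx = [j for j, other in enumerate(labels) if other == lab]
        mass = sum(prob[j] for j in idx)
        out.append(sum(values[j] * prob[j] for j in idx) / mass)
    return out

e1 = cond_exp(X, Y1)                     # E[X | Y1] on each atom
e2 = cond_exp(X, Y2)                     # E[X | Y2] on each atom
e12 = cond_exp(X, list(zip(Y1, Y2)))     # E[X | Y1, Y2] on each atom
mse_fused = sum(p * x * x for p, x in zip(prob, X))  # mse of the zero estimate
print(e1, e2, e12, mse_fused)
```

Both single-observer estimates vanish on every atom, so any function of them is constant, while the joint conditional expectation reproduces $X$ and the error of the constant estimate is $A^2$.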

Note, also, that Example 1 concerned simple random variables.
The phenomenon exhibited in Example 1, however, can
hold for nonsimple random variables as shown in the following
three examples which involve more commonplace distributions
of random variables.

Example 2: Let $Y_1$ and $Y_2$ be independent Gaussian random
variables defined on the same probability space, each having
zero mean and unit variance. Let $X = Y_1 Y_2$. Then $E[X|Y_1] =
E[X|Y_2] = 0$ a.s., whereas $E[X|Y_1, Y_2] = X$ a.s.

Example 3: Let $Y_1$ and $Y_2$ be independent random variables
defined on the same probability space, such that $Y_1$ is uniformly
distributed on $[-1, 1]$ and $Y_2$ has a probability density function
given by $f(x) = (x^2/\sqrt{2\pi}) \exp(-x^2/2)$, and let $X = Y_1 Y_2$. It can
be shown that $X$ has a Gaussian distribution with zero mean
and unit variance (cf. [7, pp. 172, 176]). Then again $E[X|Y_1] =
E[X|Y_2] = 0$ a.s., whereas $E[X|Y_1, Y_2] = X$ a.s.
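
The Gaussian claim in Example 3 can be sanity-checked through moments. The sketch below assumes $Y_1$ is uniform on $[-1, 1]$ and $Y_2$ has density $x^2 \varphi(x)$, with $\varphi$ the standard Gaussian density, so that $E[Y_1^{2m}] = 1/(2m+1)$ and $E[Y_2^{2m}] = E[Z^{2m+2}] = (2m+1)!!$ for a standard Gaussian $Z$. It compares the even moments of the product with the Gaussian moments $(2m-1)!!$; odd moments vanish by symmetry. The function names are illustrative only, and matching moments is a consistency check rather than a proof.

```python
from fractions import Fraction

def double_factorial(k):
    """k!! for odd k >= -1, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def gaussian_even_moment(m):
    """E[Z^(2m)] for a standard Gaussian Z, namely (2m-1)!!."""
    return double_factorial(2 * m - 1)

def product_even_moment(m):
    """E[(Y1*Y2)^(2m)] for independent Y1 (uniform) and Y2 (density x^2 phi(x))."""
    uniform_moment = Fraction(1, 2 * m + 1)      # E[Y1^(2m)]
    y2_moment = gaussian_even_moment(m + 1)      # E[Y2^(2m)] = E[Z^(2m+2)]
    return uniform_moment * y2_moment

moments_match = all(
    product_even_moment(m) == gaussian_even_moment(m) for m in range(0, 11)
)
print(moments_match)
```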

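Example 4: For an integer $n > 1$, consider a set of random
variables $\{X, Y_1, \ldots, Y_n\}$ with a joint probability density function
given, as in [8], by

$$f(x, y_1, \ldots, y_n) = \frac{1}{(2\pi)^{(n+1)/2}} \exp\left(-\frac{1}{2}\Big(x^2 + \sum_{i=1}^{n} y_i^2\Big)\right) \left[1 + x \exp\left(-\frac{x^2}{2}\right) \prod_{i=1}^{n} y_i \exp\left(-\frac{y_i^2}{2}\right)\right].$$

It follows straightforwardly that the set $\{X, Y_1, \ldots, Y_n\}$ is not
mutually Gaussian and not mutually independent, yet any proper
subset of $\{X, Y_1, \ldots, Y_n\}$ containing at least two random variables
is mutually independent, mutually Gaussian, and identically
distributed with each random variable having zero mean and
unit variance. For any nonempty proper subset $\mathcal{Y}$ of $\{Y_1, \ldots, Y_n\}$,
we note that $E[X|\mathcal{Y}] = 0$ a.s. since $X$ is independent of $\mathcal{Y}$.
However, it follows quickly that

$$E[X|Y_1, \ldots, Y_n] = \frac{1}{2\sqrt{2}}\, Y_1 \cdots Y_n \exp\left(-\frac{1}{2} \sum_{i=1}^{n} Y_i^2\right) \quad \text{a.s.}$$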


Thus, since any Borel measurable function of the estimates
$E[X|\mathcal{Y}]$, where $\mathcal{Y}$ ranges over all nonempty proper subsets of
$\{Y_1, \ldots, Y_n\}$, would be constant almost surely, it would not be
reasonable to attempt to estimate $E[X|Y_1, \ldots, Y_n]$ based on a
combination of these estimates.
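
The constant $1/(2\sqrt{2})$ in Example 4 can be checked numerically. The sketch below assumes the joint density is a standard Gaussian density multiplied by $1 + x e^{-x^2/2} \prod_i y_i e^{-y_i^2/2}$, in which case the constant equals $\int x^2 \varphi(x) e^{-x^2/2}\,dx$ with $\varphi$ the standard Gaussian density. The midpoint-rule helper and the function names are illustrative choices, not taken from the correspondence.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (j + 0.5) * width) for j in range(steps))

def phi(x):
    """Standard Gaussian density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Integral of x * phi(x) * [x * exp(-x^2 / 2)] over the real line (tails truncated).
constant = midpoint_integral(
    lambda x: x * x * phi(x) * math.exp(-x * x / 2.0), -10.0, 10.0
)

def fused_target(ys):
    """E[X | Y_1 = y_1, ..., Y_n = y_n] under the assumed density."""
    prod = 1.0
    for y in ys:
        prod *= y * math.exp(-y * y / 2.0)
    return constant * prod

print(constant, 1.0 / (2.0 * math.sqrt(2.0)))
```

The target `fused_target` depends on all of the observations jointly, whereas every estimate built from a proper subset of them is identically zero.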

Notice that in Example 2 the observations are Gaussian, and
in Example 3 the signal of interest is Gaussian. Further, in
Example 4 the signal of interest is Gaussian, the observations
are mutually Gaussian, and the problem under consideration is
expanded to include fusion of estimates of the form $E[X|\mathcal{Y}]$,
where $\mathcal{Y}$ is any nonempty proper subset of the observations. In
each case, estimation of $X$ via fusion is hopeless and even the
ubiquitous Gaussian assumption does not alleviate this difficulty.
However, as will be shown next, with an appropriate
restriction on the observations, almost surely exact fusion is
possible.


III.  Fusion  in  a  Practical  Setting 

Let $(\Omega, \mathcal{F}, P)$ be a fixed probability space on which all of the
following random variables will be defined, and let $n$ be a
positive integer. We will now consider an approach motivated by
more practical concerns. The following notation will be used
throughout the remainder of this section. Let $Y_i = X + N_i$, where
$X, N_1, N_2, \ldots, N_n$ are mutually independent, $N_i$ represents
additive noise, and $X$ is a second-order random variable representing
the signal of interest. As before, we consider the problem of
estimating $X$ via some combination of the $E[X|Y_i]$'s.

Assume that $N_1, N_2, \ldots, N_n$ each possesses a probability density
function. Let $F_X$ denote the distribution function of $X$, and
$f_{N_i}$ denote a Borel measurable density function of $N_i$. Further,
notice that via a straightforward change of variable and using
the independence assumption, we have that

$$E[X|Y_1, \ldots, Y_n] = \frac{\int_{\mathbb{R}} x \prod_{i=1}^{n} f_{N_i}(Y_i - x)\, dF_X(x)}{\int_{\mathbb{R}} \prod_{i=1}^{n} f_{N_i}(Y_i - x)\, dF_X(x)} \quad \text{a.s.}$$


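Now, in addition to these assumptions, assume that for each
positive integer $i \le n$, $N_i$ has a zero-mean Gaussian distribution
with a positive variance denoted by $\sigma_i^2$. Notice that in this case,
choosing continuous versions of the density functions, we have

$$\prod_{i=1}^{n} f_{N_i}(Y_i - x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{(Y_i - x)^2}{2\sigma_i^2}\right) = K \exp\left(-\frac{A x^2}{2} + xS\right),$$

where we define

$$S = \sum_{i=1}^{n} \frac{Y_i}{\sigma_i^2}, \qquad A = \sum_{i=1}^{n} \frac{1}{\sigma_i^2},$$

and

$$K = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{Y_i^2}{2\sigma_i^2}\right).$$

Substitution into the previous expression for $E[X|Y_1, \ldots, Y_n]$
now implies that

$$E[X|Y_1, \ldots, Y_n] = \frac{\int_{\mathbb{R}} x \exp\left(-\frac{A x^2}{2} + xS\right) dF_X(x)}{\int_{\mathbb{R}} \exp\left(-\frac{A x^2}{2} + xS\right) dF_X(x)} \quad \text{a.s.} \qquad (1)$$

For the special case where $X$ has a density given by
$(2\pi\sigma^2)^{-1/2} \exp(-x^2/2\sigma^2)$, we see that $E[X|Y_1, \ldots, Y_n] =
S\sigma^2/(1 + A\sigma^2)$ a.s. Furthermore, $E[X|Y_i] = Y_i\sigma^2/(\sigma_i^2 + \sigma^2)$ a.s.
Thus, we see that in this situation,

$$E[X|Y_1, \ldots, Y_n] = \frac{1}{1 + A\sigma^2} \sum_{i=1}^{n} \frac{\sigma_i^2 + \sigma^2}{\sigma_i^2}\, E[X|Y_i] \quad \text{a.s.}$$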
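
The Gaussian special case can be illustrated numerically. The sketch below assumes $X$ is zero-mean Gaussian with variance `signal_var`, $Y_i = X + N_i$ with independent zero-mean Gaussian noise of variance `noise_vars[i]`, and the closed forms $S\sigma^2/(1 + A\sigma^2)$ with $S = \sum_i Y_i/\sigma_i^2$, $A = \sum_i 1/\sigma_i^2$ for the joint estimate and $Y_i\sigma^2/(\sigma_i^2 + \sigma^2)$ for each local estimate. All function names and numerical values are hypothetical.

```python
def direct_estimate(ys, noise_vars, signal_var):
    """E[X | Y_1..Y_n] computed from the raw observations."""
    s = sum(y / v for y, v in zip(ys, noise_vars))
    a = sum(1.0 / v for v in noise_vars)
    return s * signal_var / (1.0 + a * signal_var)

def local_estimate(y, noise_var, signal_var):
    """E[X | Y_i] computed by a single field observer."""
    return y * signal_var / (noise_var + signal_var)

def fuse(local_estimates, noise_vars, signal_var):
    """Recover E[X | Y_1..Y_n] from the local estimates alone."""
    a = sum(1.0 / v for v in noise_vars)
    weighted = sum(
        e * (v + signal_var) / v for e, v in zip(local_estimates, noise_vars)
    )
    return weighted / (1.0 + a * signal_var)

ys = [0.3, -1.2, 2.0]            # illustrative observations
noise_vars = [0.5, 1.0, 2.0]     # illustrative noise variances
signal_var = 1.5                 # illustrative signal variance

local_values = [local_estimate(y, v, signal_var) for y, v in zip(ys, noise_vars)]
direct = direct_estimate(ys, noise_vars, signal_var)
fused = fuse(local_values, noise_vars, signal_var)
print(direct, fused)
```

The central location needs only the local estimates and the variances, not the observations themselves, to reproduce the joint estimate.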


In other words, $E[X|Y_1, \ldots, Y_n]$ is equal a.s. to a Borel measurable
function of the $E[X|Y_i]$'s for this case when $X$ is Gaussian.
One might ask if such a result holds for any other distributions
on $X$. The following theorem addresses this question and shows
that, in the context of the previous assumptions, almost surely
exact fusion is always possible for any second-order random
variable $X$.


Theorem 1: Consider a probability space $(\Omega, \mathcal{F}, P)$ and random
variables $X, N_1, \ldots, N_n$ defined on $(\Omega, \mathcal{F}, P)$ where $n$ is a
positive integer and $X$ is a second-order random variable.
Further, assume that for each positive integer $i \le n$, $N_i$ has
a zero-mean Gaussian distribution with positive variance given
by $\sigma_i^2$, and that $X, N_1, \ldots, N_n$ are mutually independent.
Define $Y_i = X + N_i$ for $i = 1, \ldots, n$. Then there exists a Borel
measurable function $g: \mathbb{R}^n \to \mathbb{R}$ such that $E[X|Y_1, \ldots, Y_n] =
g(E[X|Y_1], \ldots, E[X|Y_n])$ a.s.

Proof: If $X$ is a.s. equal to a constant, the result is obvious.
Assume that $X$ is not almost surely equal to a constant. Using
(1) it immediately follows that a version of the regression function
$E[X|Y_i = y]$ is given by

$$\frac{\int_{\mathbb{R}} x \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x)}{\int_{\mathbb{R}} \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x)}.$$

This version will be used throughout the remainder of the proof.
It now follows that

$$\frac{d}{dy} E[X|Y_i = y] = \frac{\int_{\mathbb{R}} \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x) \int_{\mathbb{R}} \frac{x^2}{\sigma_i^2} \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x) - \left[\int_{\mathbb{R}} \frac{x}{\sigma_i} \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x)\right]^2}{\left[\int_{\mathbb{R}} \exp\left(\frac{-x^2 + 2xy}{2\sigma_i^2}\right) dF_X(x)\right]^2}.$$

Notice that the denominator of this expression is positive. Further,
the Schwarz inequality, which is a strict inequality since $X$
is not a.s. equal to a constant, implies that the numerator is also
positive. Thus, since $(d/dy) E[X|Y_i = y] > 0$, we see that
$E[X|Y_i = y]$ is a strictly increasing function of $y$. Hence, there
exists a Borel measurable function $g_i$ so that $g_i(E[X|Y_i]) = Y_i$
a.s. Notice that

$$S = \sum_{i=1}^{n} \frac{1}{\sigma_i^2}\, g_i(E[X|Y_i]) \quad \text{a.s.}$$

Thus, substitution of this expression for $S$ into (1) provides a
Borel measurable function $g: \mathbb{R}^n \to \mathbb{R}$ such that

$$E[X|Y_1, \ldots, Y_n] = g(E[X|Y_1], \ldots, E[X|Y_n]) \quad \text{a.s.} \qquad \Box$$


Hence,  Theorem  1  shows  that  almost  surely  exact  fusion  in 
the  setting  under  consideration  is  always  possible.  Notice  again 
that this result holds for any second-order random variable of
interest. We next present an example which serves to illustrate
the utility of Theorem 1.

Example 5: In the context of Theorem 1, let $X = 1$ with
probability one half and let $X = -1$ with probability one half.
Then it straightforwardly follows that a version of $E[X|Y_i = y]$
is given by $\tanh(y/\sigma_i^2)$. Now, fixing this version and adopting
the notation used earlier in this section, we have that
$S = \sum_{i=1}^{n} \tanh^{-1}(E[X|Y_i])$ a.s. Further, (1) simplifies to
$E[X|Y_1, \ldots, Y_n] = \tanh(S)$ a.s. Hence, we see that

$$E[X|Y_1, \ldots, Y_n] = \tanh\left(\sum_{i=1}^{n} \tanh^{-1}(E[X|Y_i])\right) \quad \text{a.s.}$$

Thus, in this case we have, as guaranteed by Theorem 1, a
precise expression where $E[X|Y_1, \ldots, Y_n]$ is equal a.s. to a
specific Borel measurable transformation of the $E[X|Y_i]$'s.
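
The fusion rule of Example 5 is easy to exercise numerically. The sketch below assumes $X = \pm 1$ with equal probability and independent zero-mean Gaussian noise, computes the posterior mean directly from the Gaussian likelihoods, and compares it with the hyperbolic-tangent fusion of the local estimates. The function names and the sample observations are hypothetical.

```python
import math

def local_estimate(y, noise_var):
    """E[X | Y_i = y] for X = +1 or -1 with equal probability."""
    return math.tanh(y / noise_var)

def direct_estimate(ys, noise_vars):
    """Posterior mean of X computed from the joint Gaussian likelihoods."""
    log_plus = sum(-(y - 1.0) ** 2 / (2.0 * v) for y, v in zip(ys, noise_vars))
    log_minus = sum(-(y + 1.0) ** 2 / (2.0 * v) for y, v in zip(ys, noise_vars))
    shift = max(log_plus, log_minus)         # guard against underflow
    w_plus = math.exp(log_plus - shift)
    w_minus = math.exp(log_minus - shift)
    return (w_plus - w_minus) / (w_plus + w_minus)

def fuse(local_estimates):
    """tanh of the sum of inverse hyperbolic tangents of the local estimates."""
    return math.tanh(sum(math.atanh(e) for e in local_estimates))

ys = [0.4, -0.7, 1.1]            # illustrative observations
noise_vars = [1.0, 2.0, 0.5]     # illustrative noise variances

local_values = [local_estimate(y, v) for y, v in zip(ys, noise_vars)]
direct = direct_estimate(ys, noise_vars)
fused = fuse(local_values)
print(direct, fused)
```

In floating point, `math.atanh` requires local estimates strictly inside $(-1, 1)$, so very large values of $|y|/\sigma_i^2$ would need separate handling in practice.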


IV.  Conclusion 

We considered the problem of fusion in estimation theory.
We presented several examples, using common distributions, in
which virtually any method of fusion would be useless in
approximating the random variable of interest. Further, we presented a
theorem which, for a very general situation, shows that fusion
resulting in an almost surely exact approximation is always
possible. In particular, this result addressed the situation in
which the data consisted of the random variable of interest
corrupted by additive Gaussian noise and the random variable
of interest could be any second-order random variable. Finally,
an example was presented that illustrates the utility of this
result.

B. Optimal Estimation with Respect to a Large Family of Cost
Functions

Our  results  pertaining  to  optimal  estimation  are  found  in 
Appendix  B  in  the  article  entitled,  "On  optimal  estimation  with 
respect  to  a  large  family  of  cost  functions,"  which  appeared  in  the 
May  1991  issue  of  IEEE  Transactions  on  Information  Theory.  In  this 
article  we  considered  the  problem  of  optimal  estimation  of  a  random 
variable  X  based  on  an  observation  denoted  by  a  random  vector  Y.  We 
gave  a  mild  restriction  on  the  regular  conditional  distribution 
function of X given σ(Y) that ensures that E[Φ(X − g(Y))] is minimized
for any cost function Φ that is nonnegative, even, and convex. We
showed  that  given  any  real  valued  Borel  measurable  function  there 
exist  random  variables  X  and  Y,  possessing  a  joint  density  function, 
so  that  the  chosen  function  is  an  optimal  estimator,  with  respect  to 
any  of  the  cost  functions  previously  described,  of  the  random 
variable  X  based  on  the  random  variable  Y.  The  results  were  then 
extended  to  estimation  of  X  based  upon  a  random  variable  that  is 
measurable with respect to any given σ-subalgebra.


On  Optimal  Estimation  with  Respect  to  a  Large 
Family  of  Cost  Functions 

Eric  B.  Hall  and  Gary  L.  Wise 


Abstract: Consider two random variables $X$ and $Y$. A commonly
encountered problem involves estimating $X$ via $h(Y)$ so as to minimize
$E[\Phi(X - h(Y))]$ where $h$ is Borel measurable and $\Phi$ is a Borel
measurable cost function chosen to adequately reflect the fidelity demands of
the problem under consideration. This correspondence places a mild
condition on the regular conditional distribution of $X$ given $\sigma(Y)$ that
ensures that $E[\Phi(X - h(Y))]$ is minimized for any cost function $\Phi$ that
is nonnegative, even, and convex. In addition, it is shown that given any
Borel measurable function $g: \mathbb{R} \to \mathbb{R}$, there exist random variables $X$
and $Y$ possessing a joint density function such that $E[X|Y = y] = g(y)$
a.e., with respect to Lebesgue measure.

Index Terms: Optimal nonlinear estimation, non-mean-square-error
fidelity criteria, regression functions.


I.  Introduction 

In this correspondence we consider the problem of estimation
with respect to nontraditional cost functions. In an estimation
problem one is often confronted with two concerns in choosing a
cost function: the concern that the cost function adequately
reflects the cost one wishes to associate with an error, and the
concern that the cost function results in a problem which one
finds to be mathematically tractable. Traditional cost functions,
such as the quadratic cost function that is associated with the
extremely popular mean-square-error criterion, are usually chosen
solely on the basis of the second of these two concerns. As a
result, the fidelity demands of the specific problem under
consideration are rarely relied upon, and, in fact, are often not even
considered, when determining the cost function which will be
used. This sacrifice of suitability for mathematical ease in the
choice of a cost function should be the cause of some concern
since the traditional choices are unsuitable for many problems
in estimation. This correspondence lessens this problem by
extending the domain of mathematical tractability to include
many cost functions that, even though pertinent to the subjective
demands of many problems, have in the past been excluded
from consideration.

II. Development

In 1958, Seymour Sherman published a paper entitled
"Non-Mean-Square Error Criteria" [1] in which he proposed conditions
on a conditional distribution that would allow for the
simultaneous minimization of a large family of cost functions. In
[2] we provided a proof of Sherman's proposal and explored
several extensions and practical consequences. Although Sherman's
result had been widely quoted prior to [2], a correct proof
seems to have been elusive. For example, several proofs [3, pp.
308-310]; [4, pp. 10-12]; [5, p. 61] using integration by parts
were attempted even though the conditions placed on the cost
function were insufficient to allow such a method to be used. (In
particular, the proofs referenced above may fail for $X$ and $Y$
mutually Gaussian if a cost function $\Phi: \mathbb{R} \to [0, \infty)$ is chosen
which is even, strictly increasing on $[0, \infty)$, and singular. For an
explicit counterexample the interested reader is referred to [6].)

For a topological space $T$, we will let $\mathcal{B}(T)$ denote the Borel
subsets of $T$; and $I_S(\cdot)$ will denote the indicator function of the
set $S$. Let $\mathbb{N}$ denote the set of positive integers. Also, recall that
a probability distribution function $F: \mathbb{R} \to [0, 1]$ is said to be
unimodal about $y \in \mathbb{R}$ if $F$ is convex on $(-\infty, y)$ and concave on
$(y, \infty)$, and a probability distribution function $F: \mathbb{R} \to [0, 1]$ is
said to be symmetric if for all real $x$, $F(x) = 1 - \lim_{h \downarrow 0} F(-x - h)$.
If $k$ is a positive integer and $Y_1, \ldots, Y_k$ are $k$ random variables
defined on a common probability space, the random vector
$Y = [Y_1, \ldots, Y_k]$ induces a probability measure on $\mathcal{B}(\mathbb{R}^k)$; we
will denote this resulting measure, conventionally known as the
distribution of $Y$, by the notation $\mu_Y$. Finally, we recall that for
a random variable $X$ and a $\sigma$-subalgebra $\mathcal{G}$, a regular
conditional distribution function $F: \mathbb{R} \times \Omega \to [0, 1]$ always exists [7,
pp. 263-264]; such a function is characterized by the following
two conditions: for each $\omega \in \Omega$, $F(\cdot, \omega)$ is a probability
distribution function, and, for each $x \in \mathbb{R}$, $F(x, \omega) = P(X \le x \,|\, \mathcal{G})(\omega)$
a.s.

Sherman's original proposal (generalized and proved in [2])
required a regular conditional distribution function that, when
properly shifted, is symmetric and unimodal about the origin
and a cost function that is nonnegative, even, and nondecreasing
to the right of the origin. For mutually Gaussian random variables
$X$ and $Y$ it follows easily that the resulting regular
conditional distribution function is symmetric and unimodal
about $E[X|Y](\omega)$ for any fixed $\omega$. This special case explains
why Sherman's result is often invoked to add a flavor of generality
to papers that consider Gaussian distributions. When one
attempts to venture outside this somewhat limited arena, however,
the conditions which Sherman placed on a regular conditional
distribution function immediately begin to feel overly
restrictive. The conditions on the cost function, however, are
extremely nonrestrictive and, in fact, allow for many interesting,
albeit impractical, choices. This imbalance suggests the possibility
of lessening the restrictions on the regular conditional
distribution function by perhaps slightly increasing the restrictions
imposed on the cost function. The following lemma will allow us
to present such a result.

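Lemma 1: Assume that $F$ is a symmetric probability distribution
function and that $\Phi: \mathbb{R} \to [0, \infty)$ is even and convex. Then

$$\int_{\mathbb{R}} \Phi(x)\, dF(x) \le \int_{\mathbb{R}} \Phi(x - a)\, dF(x), \quad \text{for all } a \in \mathbb{R}.$$

Proof: Since $\Phi$ is convex we see that $\Phi(x) \le (1/2)\Phi(x - a)
+ (1/2)\Phi(x + a)$. Further, since $F$ is symmetric and $\Phi$ is even,
we see that

$$\int_{\mathbb{R}} \Phi(x + a)\, dF(x) = \int_{\mathbb{R}} \Phi(x - a)\, dF(x).$$

Thus, we see that

$$\int_{\mathbb{R}} \Phi(x)\, dF(x) \le \frac{1}{2} \int_{\mathbb{R}} \Phi(x - a)\, dF(x) + \frac{1}{2} \int_{\mathbb{R}} \Phi(x + a)\, dF(x) = \int_{\mathbb{R}} \Phi(x - a)\, dF(x). \qquad \Box$$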
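
Lemma 1 can be illustrated with exact arithmetic on a small discrete distribution. The sketch below uses a hypothetical symmetric, bimodal distribution with atoms at $\pm 1$ and $\pm 3$ and a few even convex cost functions. It also evaluates one even, nondecreasing, but nonconvex cost to show why convexity matters once unimodality is not assumed. All names and numerical choices are illustrative.

```python
from fractions import Fraction

# Symmetric but not unimodal: most of the mass sits at -3 and +3.
atoms = {
    Fraction(-3): Fraction(3, 8), Fraction(-1): Fraction(1, 8),
    Fraction(1): Fraction(1, 8), Fraction(3): Fraction(3, 8),
}

# Even, convex, nonnegative cost functions.
costs = {
    "absolute": lambda e: abs(e),
    "quadratic": lambda e: e * e,
    "quartic": lambda e: e ** 4,
    "dead-zone": lambda e: max(abs(e) - 1, 0),
}

def expected_cost(cost, shift):
    """Integral of cost(x - shift) with respect to the discrete distribution."""
    return sum(p * cost(x - shift) for x, p in atoms.items())

shifts = [Fraction(k, 4) for k in range(-20, 21)]
lemma_holds = all(
    expected_cost(cost, 0) <= expected_cost(cost, a)
    for cost in costs.values() for a in shifts
)

# Even and nondecreasing on [0, inf), but not convex: the inequality can fail.
bounded = lambda e: min(abs(e), 1)
print(lemma_holds, expected_cost(bounded, 0), expected_cost(bounded, 3))
```

For the bounded cost, shifting to one of the modes gives a strictly smaller expected cost than the zero shift, so the conclusion of the lemma does not extend to that cost for this distribution.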

Lemma 1 will allow a result similar to that given in [1] to be
stated for a much less restrictive family of regular conditional
distribution functions by slightly restricting the family of
allowable cost functions. In particular, we will be able to drop the
restriction that the conditional distribution function be
unimodal by requiring that the cost function, in addition to the
previous restrictions, also be convex. Notice that requiring the
cost function to be even and convex implies that it is also
nondecreasing to the right of the origin.

We are now in a position to state and prove the following
result.

Theorem 2: Let $k \in \mathbb{N}$, $(\Omega, \mathcal{F}, P)$ be a probability space, and
$X, Y_1, \ldots, Y_k$ be random variables defined on $(\Omega, \mathcal{F}, P)$, with $X$
integrable. Let $M: \mathbb{R}^k \to \mathbb{R}$ be a Borel measurable function
such that $M(Y_1(\omega), \ldots, Y_k(\omega)) = E[X|Y_1, \ldots, Y_k](\omega)$ a.s., and let
$F: \mathbb{R} \times \Omega \to [0, 1]$ be a regular conditional distribution function
of $X$ conditioned on $\sigma(Y_1, \ldots, Y_k)$ such that

$$F(x + M(Y_1(\omega), \ldots, Y_k(\omega)), \omega),$$

as a function of $x$ with $\omega$ fixed, is symmetric. Then $M$ minimizes
the quantity $E[\Phi(X - f(Y_1, \ldots, Y_k))]$ over all Borel measurable
functions $f: \mathbb{R}^k \to \mathbb{R}$ where $\Phi: \mathbb{R} \to [0, \infty)$ is even and
convex.

Proof: Lemma 1 and a change of variables imply that for
each fixed $\omega$ and for $a \in \mathbb{R}$,

$$\int_{\mathbb{R}} \Phi(x - M(Y_1(\omega), \ldots, Y_k(\omega)))\, dF(x, \omega) \le \int_{\mathbb{R}} \Phi(x - a - M(Y_1(\omega), \ldots, Y_k(\omega)))\, dF(x, \omega).$$

Let $g: \mathbb{R}^k \to \mathbb{R}$ be a Borel measurable function by which $X$ is to
be estimated and $E[\Phi(X - g(Y_1, \ldots, Y_k))]$ minimized. Note that

$$E[\Phi(X - g(Y_1, \ldots, Y_k))] = E\big[E[\Phi(X - g(Y_1, \ldots, Y_k)) \,|\, \sigma(Y_1, \ldots, Y_k)]\big].$$

From the preceding inequality and [8, p. 79], the inner expectation,
and thus this expression, is minimized when $g(Y_1, \ldots, Y_k)
= M(Y_1, \ldots, Y_k)$. $\Box$

We will next present a useful corollary to Theorem 2. Let
$k \in \mathbb{N}$, $(\Omega, \mathcal{F}, P)$ be a probability space, $X$ be a random variable
defined on $(\Omega, \mathcal{F}, P)$, and $Y$ be a random vector defined
on $(\Omega, \mathcal{F}, P)$ taking values in $\mathbb{R}^k$. Recall that $F(x|Y = y)$ is said
to be a regular conditional distribution function for $X$ given
$Y = y$ if for each fixed $y \in \mathbb{R}^k$, $F(x|Y = y)$ is a probability
distribution function as a function of $x$, and for each fixed
$x \in \mathbb{R}$, $F(x|Y = y)$ is a version of the regression function
$E[I_{(-\infty, x]}(X)|Y = y]$. Further, recall that a regular conditional
distribution function for $X$ given $Y$ always exists [cf. 9, pp.
372-376]. The next corollary, which follows straightforwardly
from Theorem 2, removes the need to work on the underlying
probability space.

Corollary 3: Let $k \in \mathbb{N}$, $(\Omega, \mathcal{F}, P)$ be a probability space, $X$
be an integrable random variable defined on $(\Omega, \mathcal{F}, P)$, $Y$ be a
random vector defined on $(\Omega, \mathcal{F}, P)$ taking values in $\mathbb{R}^k$, and
$M: \mathbb{R}^k \to \mathbb{R}$ be any Borel measurable function equal a.e. $[\mu_Y]$ to
$E[X|Y = y]$. Further, assume that, as a function of $x$ with $y$
fixed, a regular conditional distribution function of $X$ given
$Y = y$, denoted by $F(x|Y = y)$, is such that

$$F(x + M(y)|Y = y)$$

is symmetric. Then $g = M$ minimizes

$$E[\Phi(X - g(Y))]$$

over all Borel measurable functions $g: \mathbb{R}^k \to \mathbb{R}$ where $\Phi: \mathbb{R} \to
[0, \infty)$ is even and convex.

III.  A  Non-Gaussian  Application 

The following example illustrates the usefulness of Theorem 2
and Corollary 3 and shows how these results may be applied to
non-Gaussian distributions. In particular, given any real valued
Borel measurable function $g(\cdot)$, we show that there exists a
random variable $X$ and a random variable $Y$, possessing a joint
density function, so that $E[\Phi(X - h(Y))]$ is minimized when
$h(\cdot) = g(\cdot)$ for any cost function $\Phi$ that is nonnegative, even,
and convex.

Example: Let $g: \mathbb{R} \to \mathbb{R}$ be Borel measurable and define

$$f(x, y) = \frac{1}{8} \exp\big(-\exp(|y|)\,|x - g(y) + K|\big) + \frac{1}{8} \exp\big(-\exp(|y|)\,|x - g(y) - K|\big),$$

where $K$ is some fixed real number. Note that $f(x, y)$ is a joint
probability density function since

$$\int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\, dx\, dy = \int_{\mathbb{R}} \int_{\mathbb{R}} \left[\frac{1}{8} \exp\big(-\exp(|y|)\,|x - g(y) + K|\big) + \frac{1}{8} \exp\big(-\exp(|y|)\,|x - g(y) - K|\big)\right] dx\, dy$$

$$= \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{1}{4} \exp\big(-\exp(|y|)\,|x|\big)\, dx\, dy = \int_{\mathbb{R}} \frac{1}{2} \exp(-|y|)\, dy = 1.$$

Let $X$ and $Y$ be random variables such that the pair $(X, Y)$
has a joint density function given by $f(x, y)$. Notice from the
above calculation that a second marginal density of $f(x, y)$ is
given by $f_Y(y) = \frac{1}{2} \exp(-|y|)$. Also, notice that $f(x + g(y), y)$,
as a function of $x$ with $y$ fixed, is even. Recall that a version of
$E[X|Y = y]$ is given by $\int_{\mathbb{R}} x f(x, y)/f_Y(y)\, dx$. Fixing this version
and using this expression for $f_Y(y)$ implies that

$$E[X|Y = y] = 2\exp(|y|) \int_{\mathbb{R}} \frac{x}{8} \Big(\exp\big(-\exp(|y|)\,|x - g(y) + K|\big) + \exp\big(-\exp(|y|)\,|x - g(y) - K|\big)\Big)\, dx$$

$$= 2\exp(|y|) \int_{\mathbb{R}} \big((\tau + g(y) - K) + (\tau + g(y) + K)\big)\, \frac{1}{8} \exp\big(-\exp(|y|)\,|\tau|\big)\, d\tau$$

$$= 2\exp(|y|)\, \big(2g(y) + K - K\big)\, \frac{1}{4} \exp(-|y|) = g(y).$$

Since $f(x, y)$, as a function of $x$ for $y$ fixed, is even about $g(y)$,
it follows that the conditional density function $f(x, y)/f_Y(y)$
shares this same property. Thus, it is easy to see that the
associated regular conditional distribution function, when properly
shifted, is symmetric (and not unimodal if $K \ne 0$).

Corollary 3 may thus be applied to see that $h(\cdot) = g(\cdot)$, which
we recall was an arbitrary Borel measurable function, minimizes
$E[\Phi(X - h(Y))]$ over all Borel measurable functions $h: \mathbb{R} \to \mathbb{R}$
where $\Phi: \mathbb{R} \to [0, \infty)$ is even and convex. Notice that this example
illustrates the applicability of Theorem 2 in a situation
where Sherman's result would not apply, and, in addition, it
demonstrates the applicability of these results to non-Gaussian
distributions. Notice further that this example also points out
that the existence of a joint density function in no way guarantees
that a regression function will obey any regularity property
other than Borel measurability.
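
The claims of the example can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names, the integration window, and the particular choices of $g$ and $K$ in the usage note are illustrative assumptions and are not part of the original paper. It evaluates the joint density with the constant $1/8$ and approximates the marginal density of $Y$ and the conditional mean of $X$ by a Riemann sum in $x$.

```python
import numpy as np


def joint_density(x, y, g, K):
    """Joint density f(x, y) of the example, with normalizing constant 1/8."""
    a = np.exp(np.abs(y))
    return 0.125 * (np.exp(-a * np.abs(x - g(y) + K))
                    + np.exp(-a * np.abs(x - g(y) - K)))


def conditional_mean(y, g, K, half_width=60.0, num=400_001):
    """Approximate E[X | Y = y] and f_Y(y) with a Riemann sum centered at g(y)."""
    x = np.linspace(g(y) - half_width, g(y) + half_width, num)
    dx = x[1] - x[0]
    fx = joint_density(x, y, g, K)
    marginal = np.sum(fx) * dx            # approximates f_Y(y)
    mean = np.sum(x * fx) * dx / marginal  # approximates E[X | Y = y]
    return mean, marginal
```

For instance, with the discontinuous choice `g = np.floor` and `K = 0.7` (both arbitrary), `conditional_mean(1.5, np.floor, 0.7)` should return a mean close to `1.0` and a marginal value close to `0.5 * np.exp(-1.5)`, in agreement with the derivation above.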

IV.  The  General  Case 

The preceding development was concerned with an attempt to
estimate the integrable random variable $X$ based on a Borel
measurable function of the random variables $Y_1, \ldots, Y_k$, where $k$
is a positive integer. In this case, our estimate was a
$\sigma(Y_1, \ldots, Y_k)$-measurable random variable. Notice that it
straightforwardly follows that $\sigma(Y_1, \ldots, Y_k)$ is countably generated
since $\mathcal{B}(\mathbb{R}^k)$ is countably generated. In many cases, we
might wish to estimate $X$ by a random variable that is measurable
with respect to some other $\sigma$-algebra. For example, consider
the case where $\{Y_t;\ t \in \mathbb{R}^k\}$ is a random field and we wish
to estimate $X$ via a random variable that is $\sigma(\{Y_t;\ t \in \mathbb{R}^k\})$-measurable.
Note that this $\sigma$-algebra need not be countably
generated. Also, consider the case where $H$ is a real, separable
Hilbert space and $Z$ is an $H$-valued random variable; here we
might wish to estimate $X$ via a $\sigma(Z)$-measurable random variable.
In the general case, $Z$ could be a random object; that is, a
random variable taking values in a measurable space $(G, \mathcal{G})$,
and we would be interested in estimating $X$ via a random
variable which is measurable with respect to $\sigma(Z)$.

The following theorem addresses the estimation of $X$ via a
random variable which is measurable with respect to any $\sigma$-subalgebra
of $\mathcal{F}$.

Theorem 4: Let $(\Omega, \mathcal{F}, P)$ be a probability space, $\mathcal{A}$ be a
$\sigma$-subalgebra of $\mathcal{F}$, and $X$ be a random variable defined on
$(\Omega, \mathcal{F}, P)$ such that $X$ is integrable. For each $\omega \in \Omega$, let
$M(\omega) = E[X \mid \mathcal{A}](\omega)$, and assume that there exists a regular
conditional distribution function of $X$ conditioned on $\mathcal{A}$,
$F: \mathbb{R} \times \Omega \to [0, 1]$, such that $F(x + M(\omega), \omega)$, as a function of $x$
with $\omega$ fixed, is symmetric. Then $M$ minimizes the quantity
$E[\Phi(X - N)]$ over all $\mathcal{A}$-measurable random variables $N$, where
$\Phi: \mathbb{R} \to [0, \infty)$ is even and convex.

Proof: Lemma 1 and a change of variables imply that for
each fixed $\omega$ and for $\alpha \in \mathbb{R}$,

$$\int_{\mathbb{R}} \Phi(x - M(\omega))\,dF(x, \omega) \le \int_{\mathbb{R}} \Phi(x - \alpha - M(\omega))\,dF(x, \omega).$$

Let $N$ be an $\mathcal{A}$-measurable random variable by which $X$ is to
be estimated and $E[\Phi(X - N)]$ minimized. Note that

$$E[\Phi(X - N)] = E\bigl[E[\Phi(X - N) \mid \mathcal{A}]\bigr].$$

From the preceding inequality and [8, p. 79], the inner expectation,
and thus the above expression, is minimized when $N = M$.


V. Conclusion

In this correspondence we have considered the problem of
optimal estimation of a random variable $X$ based on an observation
denoted by a random vector $Y$. We have given a mild
restriction on the regular conditional distribution function of $X$
given $\sigma(Y)$ that ensures that $E[\Phi(X - g(Y))]$ is minimized for
any cost function $\Phi$ that is nonnegative, even, and convex.
Further, we have shown that given any real valued Borel measurable
function there exist random variables $X$ and $Y$, possessing
a joint density function, so that the chosen function is an
optimal estimator, with respect to any of the cost functions
previously described, of the random variable $X$ based on the
random variable $Y$. The results were then extended to estimation
of $X$ based upon a random variable that is measurable with
respect to any given $\sigma$-subalgebra.

C. Decentralized Estimation


With a reasonable effort one may combine the results in
Appendices A and B to obtain methods of fusing best estimates
where the field observers estimate under different fidelity criteria.
For example, we present results of this nature in Appendix C,
which reproduces the paper entitled "Decentralized estimation with
nontraditional fidelity criteria and corrupted estimates," which
appeared in the Proceedings of the Twenty-Sixth Annual Conference
on Information Sciences and Systems.

In the context of decentralized estimation, there is a need to
efficiently and effectively process the estimates provided by
multiple sensors. It is this problem of how best to fuse the
separate estimates that is the essence of decentralized estimation.
This paper was concerned with decentralized estimation when the
estimates provided by the various sensors were corrupted by noise
during transmission to the central processor and when different
fidelity criteria were used by the different sensors. Decentralized
techniques arise naturally in a number of diverse applications such
as radar tracking, fault tolerance, two-way communications, highly
redundant sensor systems, image processing, impact point
prediction, moving source location, map updating in oceanography or
meteorology, multiple sensor navigation systems, surveillance and
search systems, underwater acoustic telemetry, power systems,
object recognition, and communications between subsystems along
unreliable or limited channels. Decentralized procedures promise
many advantages over their centralized counterparts. For example,
they may offer increased system reliability and fault tolerance,
increased immunity and resistance to noise and jamming, increased
accuracy, increased data compression and rate reduction, increased
isolation and recovery capability, a parallel structure useful when
processing a large volume of information, increased processing
speed, increased computational efficiency, increased coverage, and
an increase in the overall robustness of the system. In this paper it
is shown, in contrast to previous results, that in a general additive
Gaussian noise setting a decentralized procedure may produce the
same estimate as a centralized procedure without the need for any
intersensor communication.

Abstract

Decentralized  estimation  techniques  are  proposed 
based  upon  models  that  allow  different  nontraditional  cost 
functions  to  be  employed  by  each  sensor  and  allow  for  noise 
to  exist  between  the  sensors  and  the  central  processor. 

Introduction 

In the context of decentralized estimation, there is a
need to efficiently and effectively process the estimates provided
by multiple sensors. Indeed, it is this problem of
how best to combine or fuse the separate estimates that is
the essence of decentralized estimation. This paper is concerned
with decentralized estimation when the estimates
provided by the various sensors are corrupted by noise during
transmission to the central processor and when different
fidelity criteria are used by the different sensors.

Decentralized techniques arise naturally in a number
of diverse applications. For example, decentralized techniques
have been proposed in the areas of radar tracking,
fault tolerance, two-way communications, highly redundant
sensor systems, image processing, impact point prediction,
moving source location, map updating in oceanography or
meteorology, multiple sensor navigation systems, surveillance
and search systems, underwater acoustic telemetry,
power systems, object recognition, and communications between
subsystems along unreliable or limited channels.

Decentralized procedures promise many advantages
over their centralized counterparts. For example, decentralized
procedures may offer increased system reliability
and fault tolerance, increased immunity and resistance to
noise and jamming, increased accuracy, increased data compression
and rate reduction, increased isolation and recovery
capability, a parallel structure useful when processing
a large volume of information, increased processing speed,
increased computational efficiency, increased coverage, and
an increase in the overall robustness of the system.

Although much has been written on the subject of
decentralized estimation and data fusion, many important
questions remain unanswered. Further, when answers have
appeared they have often been incorrect or misleading. For
example, is it true (as intuition might suggest) that a decentralized
procedure can produce the same optimal estimate
produced by a centralized procedure if and only if the sensors
are allowed to communicate with each other? In [1]
it is flatly stated that for a decentralized estimation structure
to be effective the local sensors must communicate
with each other, and in [2] a decentralized procedure has
been proposed based upon additive noise and intersensor
communication that provided the same estimate as a centralized
procedure. In [3], however, it is shown, in contrast
to the previous results, that in a general additive Gaussian
noise setting a decentralized procedure may produce the
same estimate as a centralized procedure without the need
for any intersensor communication.

Data  Fusion 

In [3] the problem of decentralized estimation was
considered in a general setting. In particular, it considered
the problem of estimating a fixed second order random
variable $X$ defined on a probability space $(\Omega, \mathcal{F}, P)$
via a combination of estimates of the form $E[X \mid Y_i]$ where
$1 \le i \le n$ and $Y_1, \ldots, Y_n$ are random variables also defined
on $(\Omega, \mathcal{F}, P)$. That is, [3] considered the case in
which the central processor was provided by each sensor
with a best mean square estimate of $X$ as a Borel measurable
transformation of the observation $Y_i$. Focusing attention
on Borel measurable transformations of the data
it follows that the central processor must find a way of
approximating $E[X \mid Y_1, \ldots, Y_n]$ based on random variables
from the set $\{E[X \mid Y_1], \ldots, E[X \mid Y_n]\}$. (That is, it
must approximate the orthogonal projection of $X$ onto
$L_2(\Omega, \sigma(Y_1, \ldots, Y_n), P)$ using the orthogonal projections
of $X$ onto $L_2(\Omega, \sigma(Y_1), P), \ldots, L_2(\Omega, \sigma(Y_n), P)$.) Thus,
from a theoretical perspective, one seeks conditions under
which $E[X \mid E[X \mid Y_1], \ldots, E[X \mid Y_n]]$ may provide a good
approximation to $E[X \mid Y_1, \ldots, Y_n]$. Unfortunately, positive
results to this question are elusive in many common
settings. For example, (as shown in [3]) even if $X$ is a
Gaussian random variable, the observations $\{Y_1, \ldots, Y_n\}$
are mutually Gaussian random variables, and the problem
expands to include estimates from the sensors of the form
$E[X \mid \mathcal{P}]$ where $\mathcal{P}$ is a $\sigma$-algebra generated by any nonempty
proper subset of the observations (that is, the problem
expands to allow estimates from the sensors of the form
$E[X \mid Y_{i_1}, \ldots, Y_{i_k}]$ for $k < n$), it still might not be possible
to provide a reasonable estimate for $X$ based on the
data from the sensors. Thus, even invoking the ubiquitous
Gaussian assumption and allowing any proper subset of the
sensors to communicate may not be enough to establish reasonable
rules for data fusion in such a general setting. In
the next section we extend this example to include many
non-Gaussian distributions.


A Counterexample


As in [4], let $n > 2$ be an integer and consider random
variables $\{X_1, \ldots, X_n\}$ that have a joint probability
density function $f: \mathbb{R}^n \to \mathbb{R}$ of the form

$$f(x_1, \ldots, x_n) = [\ldots]; \quad x_i \in \mathbb{R},$$

where $f: \mathbb{R} \to \mathbb{R}$ is a standard Gaussian density function.
It follows that the random variables in $\{X_1, \ldots, X_n\}$ each
possess a standard Gaussian distribution. Further, it follows
that, whereas the random variables in $\{X_1, \ldots, X_n\}$
are not mutually independent, the random variables in any
proper subset of $\{X_1, \ldots, X_n\}$ are mutually independent.


Next, as in [5], consider a density function $g$ for which
the function $g: \mathbb{R}^n \to \mathbb{R}$ defined by

$$g(y_1, \ldots, y_n) = \Bigl[\prod_{i=1}^{n} g(y_i)\Bigr]\Bigl[1 + \prod_{i=1}^{n} y_i\,g(y_i)\Bigr]; \quad y_i \in \mathbb{R}$$

is nonnegative and integrates to 1. Let $n > 2$ be an integer.
If $\{Y_1, \ldots, Y_n\}$ are random variables with a joint density
function given by $g$ then, paralleling the work in [4], it follows
that each random variable in $\{Y_1, \ldots, Y_n\}$ possesses a
density given by $g$, and further that, although the random
variables in $\{Y_1, \ldots, Y_n\}$ are not mutually independent,
the random variables in any proper subset of $\{Y_1, \ldots, Y_n\}$
are mutually independent. Note that $g(y_1, \ldots, y_n)$ is nonnegative
and integrates to 1 if $|x\,g(x)| < 1$ for all $x \in \mathbb{R}$ and
if $\int_{\mathbb{R}} x\,g^2(x)\,dx = 0$.

Now, let $f$ be any probability density function such that
$|x f(x)| < 1$ for all $x \in \mathbb{R}$ and such that $\int_{\mathbb{R}} x f^2(x)\,dx = 0$, let
$n > 1$ be an integer, and let $\{X, Y_1, \ldots, Y_n\}$ be random
variables possessing a joint density function $f: \mathbb{R}^{n+1} \to \mathbb{R}$
of the form

$$f(x, y_1, \ldots, y_n) = f(x)\Bigl[\prod_{i=1}^{n} f(y_i)\Bigr]\Bigl[1 + x f(x)\prod_{i=1}^{n} y_i f(y_i)\Bigr]; \quad x, y_i \in \mathbb{R}.$$

Since the conditional expectation of $X$ given any proper
subset of $\{Y_1, \ldots, Y_n\}$ is almost surely zero when $f$ has zero
mean ($X$ being independent of every such subset), it follows that
any attempt to estimate $X$ via best Borel measurable transformations
of random variables from proper subsets of the
data is hopeless.
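
The structure of this counterexample can be illustrated numerically. The sketch below assumes NumPy, takes $f$ to be the standard Gaussian density and $n = 2$, and evaluates the joint density of $(X, Y_1, Y_2)$ on a grid; the grid limits and the variable names are illustrative assumptions. It checks that the density integrates to one, that the pair $(X, Y_1)$ has a product-form marginal (so that $Y_1$ alone carries no information about $X$), and that the full joint density nevertheless differs from the product of its marginals.

```python
import numpy as np


def phi(t):
    """Standard Gaussian density."""
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)


def joint(x, y1, y2):
    """Joint density f(x, y1, y2) of the counterexample with n = 2."""
    return (phi(x) * phi(y1) * phi(y2)
            * (1.0 + x * phi(x) * y1 * phi(y1) * y2 * phi(y2)))


grid = np.linspace(-8.0, 8.0, 161)
h = grid[1] - grid[0]
X, Y1, Y2 = np.meshgrid(grid, grid, grid, indexing="ij")
F = joint(X, Y1, Y2)

# Total mass of the joint density (Riemann sum over the grid).
total_mass = F.sum() * h ** 3

# Marginal density of (X, Y1), obtained by integrating out y2,
# compared with the product of the two Gaussian marginals.
marginal_x_y1 = F.sum(axis=2) * h
pairwise_gap = np.abs(marginal_x_y1 - np.outer(phi(grid), phi(grid))).max()

# Largest pointwise gap between the joint density and the product of
# its three marginals; a nonzero value reflects the dependence.
joint_gap = np.abs(F - phi(X) * phi(Y1) * phi(Y2)).max()
```

Under these assumptions `total_mass` is numerically one, `pairwise_gap` is at the level of rounding error, and `joint_gap` is clearly positive.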

Non-Mean-Square  Cost  Functions 

Now we direct our attention toward non-mean-square
cost functions. Given two random variables $X$ and $Y$
defined on a common probability space, a frequently encountered
problem in estimation theory involves finding a
function $g: \mathbb{R} \to \mathbb{R}$ that minimizes $E[\Phi(X - g(Y))]$ for some
cost function $\Phi: \mathbb{R} \to [0, \infty)$. Generally, one is confronted
with two concerns in choosing an appropriate cost function.
First, one is concerned that the cost function should
adequately reflect the cost one wishes to associate with an
error, and second, one is concerned that the cost function
should result in a problem that is mathematically tractable.
Traditional cost functions, such as the popular mean-square
error cost function, are usually chosen solely on the basis
of the second of the above two concerns. As a result, the
fidelity demands of the specific problem under consideration
are rarely relied upon, and, in fact, are often not even
considered when determining the cost function that will be
used.

This sacrifice of suitability for mathematical ease in the
choice of a cost function should be the cause of some concern
since the traditional choices are unsuitable for many
problems in estimation. As an example, consider the problem
of estimating the position of a projectile. If one is interested
in shooting down the projectile then a small error
in the estimate of its position may not result in a penalty.
If, however, the error in the estimate is such that the projectile
is missed then the penalty might suddenly become
enormous. Further, this penalty should not increase if the
error in the estimate becomes even larger since the result
of any two such errors is the same. The mean-square error
cost function applied to this situation penalizes small inconsequential
errors, assigns almost identical costs to barely
hitting and barely missing the target, and assigns a larger
cost to a far-miss than to a near-miss even though the result
in each case is the same. Clearly, the mean-square
cost function is not a very good choice in this commonly
encountered situation. Such an example demonstrates the
great need to extend the domain of mathematical tractability
to include many cost functions that, though pertinent
to the subjective demands of many problems, have often
been excluded from consideration.
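
The contrast drawn in the projectile example can be made concrete with a small sketch. The hit-or-miss cost below, together with its `kill_radius` and `miss_penalty` parameters, is a hypothetical illustration chosen for this note and does not appear in the original paper; it is even and nondecreasing on $[0, \infty)$ but not convex.

```python
def squared_error_cost(error):
    """Mean-square style cost: grows without bound as the error grows."""
    return error ** 2


def hit_or_miss_cost(error, kill_radius=1.0, miss_penalty=1.0):
    """Zero cost inside the (assumed) kill radius, a flat penalty outside it."""
    return 0.0 if abs(error) <= kill_radius else miss_penalty


# Errors: an inconsequential one, a bare hit, a bare miss, and a far miss.
errors = [0.1, 0.9, 1.1, 10.0]
cost_table = [(e, squared_error_cost(e), hit_or_miss_cost(e)) for e in errors]
```

In `cost_table` the squared-error column gives nearly the same cost to the bare hit and the bare miss and a much larger cost to the far miss, whereas the hit-or-miss column gives the same penalty to both misses and none to either hit.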

In [6], general conditions were given allowing for simultaneous
use of any cost function that is nonnegative, even,
and nondecreasing on $[0, \infty)$ or any cost function that is
nonnegative, even, and convex. Further, these results were
applied to non-Gaussian situations and extended to cover
estimation based on random variables measurable with respect
to a $\sigma$-algebra generated by a random object. In the
next section we apply these results to the area of decentralized
estimation.


Non-Mean-Square  Fusion 

Recall that a probability distribution function $F$ is
symmetric if for all real $x$, $F(x) = 1 - \lim_{h \downarrow 0} F(-x - h)$.
The following result follows directly from Theorem 1 in [3]
and Theorem 2 in [6].

Theorem 1: Consider a probability space $(\Omega, \mathcal{F}, P)$
and random variables $X, N_1, \ldots, N_n$ defined on $(\Omega, \mathcal{F}, P)$
where $n$ is a positive integer and $X$ is a second order random
variable. Further, assume that for each positive integer
$i \le n$, $N_i$ has a zero-mean Gaussian distribution with
positive variance given by $\sigma_i^2$, and that $X, N_1, \ldots, N_n$ are
mutually independent. Let $Y_i = X + N_i$ for $i = 1, \ldots, n$
and assume that a regular conditional distribution function
$F: \mathbb{R} \times \Omega \to [0, 1]$ of $X$ conditioned on $\sigma(Y_i)$ exists
such that $F(x + E[X \mid Y_i](\omega), \omega)$, as a function of $x$ with
$\omega$ fixed, is symmetric. For each positive integer $i \le n$,
let $\Phi_i: \mathbb{R} \to [0, \infty)$ be even and convex. For each positive
integer $i \le n$, there exists a Borel measurable function
$h_i: \mathbb{R} \to \mathbb{R}$ that minimizes $E[\Phi_i(X - h(Y_i))]$ over all Borel
measurable functions $h$. Further, there exists a Borel measurable
function $g: \mathbb{R}^n \to \mathbb{R}$ such that
$E[X \mid Y_1, \ldots, Y_n] = g(h_1(Y_1), \ldots, h_n(Y_n))$ a.s.

Fusion  with  Inter-Sensor  Noise 

The following example was shown in [3].

Example: Assume that all of the random variables in this
example are defined on the same probability space denoted
by $(\Omega, \mathcal{F}, P)$. For each positive integer $i \le n$, let $N_i$ have
a zero-mean Gaussian distribution with positive variance
given by $\sigma_i^2$, let $X$ be a random variable such that
$P(X = 1) = P(X = -1) = \tfrac{1}{2}$, assume that $X, N_1, \ldots, N_n$ are
mutually independent, and let $Y_i = X + N_i$. It follows from
[3] that a version of $E[X \mid Y_i = y]$ is given by $\tanh(y/\sigma_i^2)$
and that

$$E[X \mid Y_1, \ldots, Y_n] = \tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}\bigl(E[X \mid Y_i]\bigr)\Bigr) \quad \text{a.s.}$$
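
The fusion rule of this example can be checked against a direct application of Bayes' rule. The following is a minimal sketch assuming NumPy; the function names and the numerical values in the usage note are illustrative assumptions rather than part of the original paper.

```python
import numpy as np


def local_estimate(y, variance):
    """Sensor estimate E[X | Y_i = y] for X = +/-1 with equal probability."""
    return np.tanh(y / variance)


def fuse(local_estimates):
    """Fusion rule: tanh of the sum of the inverse hyperbolic tangents."""
    return np.tanh(np.sum(np.arctanh(local_estimates)))


def centralized_estimate(ys, variances):
    """E[X | Y_1, ..., Y_n] computed directly from the Gaussian likelihoods."""
    log_like_plus = -np.sum((ys - 1.0) ** 2 / (2.0 * variances))
    log_like_minus = -np.sum((ys + 1.0) ** 2 / (2.0 * variances))
    # Posterior probability that X = +1 under the uniform prior on {-1, +1}.
    p_plus = 1.0 / (1.0 + np.exp(log_like_minus - log_like_plus))
    return 2.0 * p_plus - 1.0
```

For example, with observations `ys = np.array([0.3, -1.2, 2.0])` and noise variances `np.array([0.5, 1.0, 2.0])`, both `fuse(local_estimate(ys, variances))` and `centralized_estimate(ys, variances)` evaluate to `np.tanh(0.4)` up to rounding, so the central processor recovers the centralized estimate from the local estimates alone.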

To begin, in the context of the previous example, note
that $E[X \mid Y_i]$ possesses a density function $e_i: \mathbb{R} \to \mathbb{R}$ of the
[...] we have

$$e_i(t) = d_i\bigl(\sigma_i^2 \tanh^{-1}(t)\bigr); \quad |t| < 1,$$

where [...]

Also, note that $|\tanh(x_1) - \tanh(x_2)| \le |x_1 - x_2|$ for all
$x_1, x_2 \in \mathbb{R}$ and, for $0 < \alpha < 1$, that
$|\tanh^{-1}(x_1) - \tanh^{-1}(x_2)| \le \frac{1}{1 - \alpha^2}|x_1 - x_2|$ for all $x_1, x_2 \in [-\alpha, \alpha]$.

Consider now random variables $Z_1, \ldots, Z_n$ defined on
the same probability space as above such that, for each positive
integer $i \le n$, $Z_i$ is Gaussian with zero mean and positive
variance $s_i^2$ and such that $X, N_1, \ldots, N_n, Z_1, \ldots, Z_n$
are mutually independent. Let $\alpha$ be an element from
$(0, 1)$. Further, for each positive integer $i \le n$, let $\beta_i$ be
a positive number, let $W_i = \beta_i E[X \mid Y_i] + Z_i$, let
$A_i = \{\omega \in \Omega : [\ldots] < \alpha\}$, and let $V_i = [\ldots]$. Let
$C_i = \{\omega \in \Omega : |E[X \mid Y_i]| < \alpha\}$ for each positive integer
$i \le n$ and note that $P(A_i^c)$ and $P(C_i^c)$ may be made arbitrarily
small by choice of $\alpha$ and $\beta_i$. Next, note that for each
fixed positive integer $i \le n$, it follows that for $\omega \in C_i \cap A_i$


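$$\Bigl|\tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}\bigl(E[X \mid Y_i]\bigr)\Bigr) - \tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}(V_i)\Bigr)\Bigr|$$
$$\le \Bigl|\sum_{i=1}^{n} \tanh^{-1}\bigl(E[X \mid Y_i]\bigr) - \sum_{i=1}^{n} \tanh^{-1}(V_i)\Bigr|$$
$$\le \sum_{i=1}^{n} \bigl|\tanh^{-1}\bigl(E[X \mid Y_i]\bigr) - \tanh^{-1}(V_i)\bigr|$$
$$\le \frac{1}{1 - \alpha^2}\sum_{i=1}^{n} \bigl|E[X \mid Y_i] - V_i\bigr| \;\; [\ldots]$$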

Next, note that, for any positive integer $i \le n$ and any
positive number $\epsilon < 1$, [...]

Thus, given any positive $p < 1$ and any positive $\epsilon < 1$, the
upper bound above may be made less than $\epsilon$ with probability
$p$ by choice of $\alpha$ and $\beta_i$ for $i = 1, \ldots, n$.

Finally, for $r > 1$, note from above that

$$E\Bigl[\Bigl|\tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}\bigl(E[X \mid Y_i]\bigr)\Bigr) - \tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}(V_i)\Bigr)\Bigr|^r I_{C \cap A}\Bigr] \;\; [\ldots]$$

Also, note that

$$E\Bigl[\Bigl|\tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}\bigl(E[X \mid Y_i]\bigr)\Bigr) - \tanh\Bigl(\sum_{i=1}^{n} \tanh^{-1}(V_i)\Bigr)\Bigr|^r I_{C^c \cup A^c}\Bigr] \le 2^r P(C^c \cup A^c).$$

Thus, it follows that the rth mean of the error between our
estimate and a best Borel measurable mean square estimate
of $X$ without inter-sensor noise may be made arbitrarily
small by choice of $\alpha$ and $\beta_i$ for $i = 1, \ldots, n$.

D. Multidimensional Quantization


Results  on  multidimensional  quantization  are  presented  in  the 
paper  entitled,  "A  result  on  multidimensional  quantization,”  which 
will  appear  in  Proceedings  of  the  American  Mathematical  Society 
and  is  given  in  Appendix  D.  Multidimensional  quantization  often 
arises  in  an  effort  to  use  digital  processing  techniques  on  data, 
since  a  quantizer  is  literally  at  the  heart  of  analog  to  digital 
conversion. In this appendix we show that a popularly used
technique for designing multidimensional quantizers fails
spectacularly.

Abstract. For any integer $N > 1$, a probability space, a Gaussian random
vector $X$ defined on the space with a positive definite covariance matrix, and
an $N$-level quantizer $Q$ are presented such that the random vector $Q(X)$ takes
on each of the $N$ values in its range with equal probability and such that $X$
and $Q(X)$ are independent.


Introduction 

Quantization, the process by which a set is mapped into a finite subset of a
given cardinality, plays a pivotal role in virtually any application that requires
analog to digital conversion; indeed, it is at the heart of much of modern digital
technology. In such applications, a quantizer is often taken to be a function
mapping $\mathbb{R}^k$ into a subset of $\mathbb{R}^k$ of cardinality $N$, where $k$ is a positive
integer and $N$ is an integer greater than one (see, e.g., [1, 5, 3, 6, 2]). In
this paper we present what might be a surprising consequence of such a general
approach to quantization.

Development

For a topological space $T$, we will let $\mathcal{B}(T)$ denote the family of Borel
subsets of $T$. For a set $S$, we will let $\mathcal{P}(S)$ denote the power set of $S$ and $I_S$
denote the indicator function of $S$. By a standard Gaussian measure we will
mean a Gaussian measure whose first moment is zero and whose second moment
is one. Let $k$ be a positive integer. For any measure $m$ on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$
we will let $m_*$ denote the inner measure on $(\mathbb{R}^k, \mathcal{P}(\mathbb{R}^k))$ induced by $m$ and
we will let $m^*$ denote the outer measure on $(\mathbb{R}^k, \mathcal{P}(\mathbb{R}^k))$ induced by $m$. Recall
from [4, p. 61] that if $B \in \mathcal{B}(\mathbb{R}^k)$ and $A \in \mathcal{P}(\mathbb{R}^k)$, then
$m_*(B \cap A) + m^*(B \cap A^c) = m(B)$. We will let $\lambda$ denote Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$
and, for integers $k > 1$, we will let $\Lambda$ denote Lebesgue measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$,
where $k$ will be determined from the context. Recall that for a measure space
$(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k), m)$, a subset $S$ of $\mathbb{R}^k$ is said to be a saturated non-$m$-measurable
set if $m_*(S) = m_*(S^c) = 0$. Finally, a $k$-dimensional quantizer of a random
variable $X$ defined on a probability space $(\Omega, \mathcal{F}, P)$ is any function
$Q: \mathbb{R}^k \to F$ such that $F$ is a finite subset of $\mathbb{R}^k$, such that $Q(x) = x$ for all
$x$ in $F$ (i.e., such that $Q$ restricted to $F$ is the identity map on $F$) and such
that $Q(X)$ is itself a random variable defined on $(\Omega, \mathcal{F}, P)$. If $F$ is a finite
subset of $\mathbb{R}^k$ with cardinality $N$ then a quantizer $Q: \mathbb{R}^k \to F$ of a random
variable $X$ is said to be an $N$-level quantizer.
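
As a concrete instance of the definition, the sketch below builds a conventional nearest-neighbor quantizer on $\mathbb{R}^k$, assuming NumPy. The nearest-neighbor rule and the names `make_quantizer` and `codebook` are illustrative assumptions; this is an ordinary quantizer with measurable cells and is not the construction developed in this paper. It maps $\mathbb{R}^k$ into a finite set $F$ of $N$ points and restricts to the identity on $F$.

```python
import numpy as np


def make_quantizer(codebook):
    """Return Q: R^k -> F, where F is the finite set of rows of `codebook`."""
    codebook = np.asarray(codebook, dtype=float)

    def quantize(x):
        x = np.asarray(x, dtype=float)
        # Euclidean distance from x to every point of F; pick the closest.
        distances = np.linalg.norm(codebook - x, axis=1)
        return codebook[np.argmin(distances)]

    return quantize
```

For example, with the three-point codebook `[[0.0, 0.0], [1.0, 1.0], [-2.0, 3.0]]` (so $k = 2$ and $N = 3$), the quantizer returns each codebook point unchanged and sends `[0.9, 1.2]` to `[1.0, 1.0]`.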

The following lemma is proved in [..., pp. 381-382].

Lemma 1. For any positive integer $M$ there exist $M$ disjoint subsets $Z_1, Z_2, \ldots, Z_M$
of the real line such that $Z_1, Z_2, \ldots, Z_M$ and $Z = Z_1 \cup \cdots \cup Z_M$
are saturated non-$\lambda$-measurable sets.

The next result is an immediate consequence of Lemma 1.

Corollary 1. For any integer $N > 1$ there exist $N$ subsets $T_1, T_2, \ldots, T_N$ of
the real line that partition the real line and are such that for each positive integer
$j \le N$, $T_j$ is a saturated non-$\lambda$-measurable set.

For our purposes the following corollary will prove useful.

Corollary 2. For any positive integer $k$ and any integer $N > 1$, there exist $N$
subsets $S_1, S_2, \ldots, S_N$ of $\mathbb{R}^k$ that partition $\mathbb{R}^k$ and are such that, for each
positive integer $j \le N$, $S_j$ is a saturated non-$\Lambda$-measurable set.

Proof. For $k = 1$, the result follows from Corollary 1. Assume $k > 1$. Let
$T_1, \ldots, T_N$ be a partition of the real line as given by Corollary 1. For positive
integers $j \le N$, let $S_j = T_j \times \mathbb{R} \times \cdots \times \mathbb{R} \subset \mathbb{R}^k$. Fix a positive integer $j \le N$
and assume that there exists a set $B \in \mathcal{B}(\mathbb{R}^k)$ such that $B \subset S_j$ and $\Lambda(B) > 0$.
Define a subset $B_1$ of $\mathbb{R}$ as follows:

$$B_1 = \{b_1 \in \mathbb{R} : (b_1, b_2, \ldots, b_k) \in B \text{ for some } (b_2, \ldots, b_k) \in \mathbb{R}^{k-1}\}.$$

Recall from [7, p. 161] that $B_1 \in \mathcal{B}(\mathbb{R})$. Further, notice that $\lambda(B_1) > 0$ since
$B \subset B_1 \times \mathbb{R} \times \cdots \times \mathbb{R} \subset \mathbb{R}^k$ and $\Lambda(B) > 0$. But, $\lambda(B_1) = 0$ since $B_1 \subset T_j$
and $\lambda_*(T_j) = 0$. This contradiction implies that $\Lambda(B) = 0$ and hence that
$\Lambda_*(S_j) = 0$. It follows similarly that $\Lambda_*(S_j^c) = 0$ also. Q.E.D.

Lemma 2. For a positive integer $k$ and an integer $N > 1$, let $S_1, S_2, \ldots, S_N$
comprise a partition of $\mathbb{R}^k$ such that for each positive integer $j \le N$, $S_j$ is a
saturated non-$\Lambda$-measurable set. The set

$$\mathcal{S} = \{(S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N) : A_i \in \mathcal{B}(\mathbb{R}^k) \text{ for } 1 \le i \le N\}$$

is a $\sigma$-algebra on $\mathbb{R}^k$.

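Proof. Choosing $A_1 = \cdots = A_N = \emptyset$ implies that $\emptyset \in \mathcal{S}$. Let $A$ be an
element of $\mathcal{S}$. Then $A = (S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N)$ for some choice of the
$A_i$'s from $\mathcal{B}(\mathbb{R}^k)$. Further, $A^c = (S_1 \cap A_1)^c \cap \cdots \cap (S_N \cap A_N)^c$. Since
$(S_i \cap A_i)^c = S_i^c \cup A_i^c$ and $S_i^c$ is the union of the sets $S_j$ with $j \neq i$,
distributing the intersection over these unions shows that $A^c$ is a finite union
of sets, each of which is of one of the following three forms:

(i) $S_{n_1} \cap \cdots \cap S_{n_l} \cap B$ where $1 \le n_1 < \cdots < n_l \le N$, $l > 1$, and $B \in \mathcal{B}(\mathbb{R}^k)$;
(ii) $S_j \cap B$ for $1 \le j \le N$ and $B \in \mathcal{B}(\mathbb{R}^k)$;
(iii) $B \in \mathcal{B}(\mathbb{R}^k)$.

Every set of the form given by (i) is empty since the $S_i$'s are disjoint. Further,
any set $B \in \mathcal{B}(\mathbb{R}^k)$ may be expressed as $B = (S_1 \cap B) \cup \cdots \cup (S_N \cap B)$. Hence,
$A^c$ is an element of $\mathcal{S}$.

Finally, if $B_1, B_2, \ldots$ are in $\mathcal{S}$, then for some choice of the $A_{ij}$'s from
$\mathcal{B}(\mathbb{R}^k)$,

$$\bigcup_{i=1}^{\infty} B_i = \bigcup_{i=1}^{\infty}\bigcup_{j=1}^{N} (S_j \cap A_{ij}) = \bigcup_{j=1}^{N}\Bigl(S_j \cap \Bigl(\bigcup_{i=1}^{\infty} A_{ij}\Bigr)\Bigr) \in \mathcal{S}.$$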


Recall that two measures $P_1$ and $P_2$ on a given measurable space $(\Omega, \mathcal{F})$
are said to be equivalent if $\{A \in \mathcal{F} : P_1(A) = 0\} = \{A \in \mathcal{F} : P_2(A) = 0\}$. Notice
that for sets $S_1, S_2, \ldots, S_N$ as above, it follows that, for any positive integer
$i \le N$ and any $\mathcal{B}(\mathbb{R}^k)$-measurable set $H$, $P_*(S_i \cap H) = 0$, $P_*(S_i^c \cap H) = 0$,
$P^*(S_i \cap H) = P(H)$, and $P^*(S_i^c \cap H) = P(H)$ for any probability measure $P$
on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$ that is equivalent to Lebesgue measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$. The
following lemma will be used in the proof of a subsequent theorem.

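Lemma 3. For a positive integer $k$ and an integer $N > 1$, let $S_1, S_2, \ldots, S_N$
comprise a partition of $\mathbb{R}^k$ such that for each positive integer $j \le N$, $S_j$ is a
saturated non-$\Lambda$-measurable set. Let $P$ be a probability measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$
that is equivalent to Lebesgue measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$. Let $A_1, \ldots, A_N$ and
$B_1, \ldots, B_N$ be sets from $\mathcal{B}(\mathbb{R}^k)$ such that

$$(S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N) = (S_1 \cap B_1) \cup \cdots \cup (S_N \cap B_N).$$

Then $P(A_i \triangle B_i) = 0$ for any positive integer $i \le N$ where for any two subsets
$A$ and $B$ of $\mathbb{R}^k$, $A \triangle B$ denotes the symmetric difference of $A$ and $B$.

Proof. Fix a positive integer $i \le N$. By assumption,

$$(S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N) = (S_1 \cap B_1) \cup \cdots \cup (S_N \cap B_N).$$

Intersecting each side with $S_i$ implies that $(S_i \cap A_i) = (S_i \cap B_i)$, which implies
that $(S_i \cap A_i) \cap (S_i \cap B_i)^c = (S_i \cap A_i) \cap (S_i^c \cup B_i^c) = (S_i \cap A_i \cap S_i^c) \cup (S_i \cap A_i \cap B_i^c) =
(S_i \cap A_i \cap B_i^c) = \emptyset$ and, similarly, that $(S_i \cap B_i \cap A_i^c) = \emptyset$. Thus, we see that
$(S_i \cap A_i \cap B_i^c) \cup (S_i \cap B_i \cap A_i^c) = S_i \cap (A_i \triangle B_i) = \emptyset$. Since $(A_i \triangle B_i) \in \mathcal{B}(\mathbb{R}^k)$, it
follows that $P(A_i \triangle B_i) = P^*(S_i \cap (A_i \triangle B_i)) = P^*(\emptyset) = 0$. Q.E.D.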

The  following  theorem  provides  a  probability  space  upon  which  the  principal 
result  of  this  paper  will  be  based. 


Theorem 1. For a positive integer $k$ and an integer $N > 1$, let $S_1, S_2, \ldots, S_N$
comprise a partition of $\mathbb{R}^k$ such that for each positive integer $j \le N$, $S_j$ is a
saturated non-$\Lambda$-measurable set. Let $P$ be a probability measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$
that is equivalent to Lebesgue measure on $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$. There exists a probability
space $(\mathbb{R}^k, \mathcal{S}, \mu)$ such that $\mathcal{S}$ includes $\mathcal{B}(\mathbb{R}^k)$, such that $\mathcal{S}$ contains
$S_1, \ldots, S_N$, such that the measure $\mu$ agrees with $P$ on $\mathcal{B}(\mathbb{R}^k)$, and such that
$\mathcal{B}(\mathbb{R}^k)$ is independent of $\sigma(S_1, \ldots, S_N)$.

Proof. Let $\mathcal{F}$ be the $\sigma$-algebra provided by Lemma 2. Recall that $\mathcal{F}$
contains all sets of the form $(S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N)$ where $A_i \in \mathcal{B}(\mathbb{R}^k)$ for
each positive integer $i \le N$. If $A \in \mathcal{B}(\mathbb{R}^k)$ then choosing $A_1 = \cdots = A_N = A$
implies that $A \in \mathcal{F}$. Similarly, for any positive integer $i \le N$, setting $A_i = \mathbb{R}^k$
and all other $A_j$'s equal to the empty set implies that $S_i \in \mathcal{F}$. Define a measure
$\mu$ on the measurable space $(\mathbb{R}^k, \mathcal{F})$ via

$$\mu((S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N)) = \frac{1}{N}\,(P(A_1) + \cdots + P(A_N))$$

for $(S_1 \cap A_1) \cup \cdots \cup (S_N \cap A_N) \in \mathcal{F}$. That $\mu$ is well defined follows from Lemma
3, and that $\mu$ is in fact a probability measure that agrees with $P$ on $\mathcal{B}(\mathbb{R}^k)$ is
then straightforward. Further notice that $\mu(S_i) = 1/N$ for each positive integer
$i \le N$ and that, for any set $B \in \mathcal{B}(\mathbb{R}^k)$ and any positive integer $i \le N$,
$\mu(S_i \cap B) = \frac{1}{N} P(B) = \mu(S_i)\,\mu(B)$. Thus $S_i$ is independent of $\mathcal{B}(\mathbb{R}^k)$ for each
positive integer $i \le N$. Finally, notice that $\mathcal{B}(\mathbb{R}^k)$ is in fact independent of
$\sigma(S_1, \ldots, S_N)$ since $\{\emptyset, S_1, \ldots, S_N\}$ is a $\pi$-system. Q.E.D.

We are now in a position to state and prove the principal result of this paper.

Theorem 2. Let $k$ be a positive integer and let $N$ be an integer greater than
one. There exists a probability space $(\Omega, \mathcal{F}, \nu)$, a Gaussian random vector
$X$ defined on $(\Omega, \mathcal{F}, \nu)$ taking values in $\mathbb{R}^k$ with a positive definite covariance
matrix, and an $N$-level $k$-dimensional quantizer $Q: \mathbb{R}^k \to F$ such that
$\nu(Q(X) = x) = 1/N$ for each $x$ in $F$ and such that $X$ and $Q(X)$ are independent.

Proof. Let $S_1, \ldots, S_N$ be sets as provided by Corollary 2. For these $N$ subsets
of $\mathbb{R}^k$, let $(\Omega, \mathcal{F}, \nu)$ be a probability space as provided by Theorem 1 where
$P$ is chosen to be the product measure induced by placing standard Gaussian
measure on each factor of $(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$. For each positive integer $i \le N$,
let $\alpha_i$ be an element from $S_i$. Let $F$ denote the set $\{\alpha_1, \ldots, \alpha_N\}$. Define
an $N$-level $k$-dimensional quantizer $Q: \mathbb{R}^k \to F$ via $Q(x) = \sum_{i=1}^{N} \alpha_i I_{S_i}(x)$.
Further, notice that the random vector $X(\omega) = \omega$, $\omega \in \Omega$, is a zero mean
Gaussian random vector defined on $(\Omega, \mathcal{F}, \nu)$ whose covariance matrix is the
$k \times k$ identity matrix. Also, notice that for $1 \le i \le N$, $\nu(Q(X(\omega)) = \alpha_i) =
\nu(\omega \in S_i) = 1/N$. Finally, notice that $X$ and $Q(X)$ are independent via
Theorem 1. Q.E.D.
Summary of Corrections to
PROC1451 (Hall and Wise)


1.  In  the  sixth  line  of  the  introduction  on  page  1,  references  8  and  10  should  be  changed  to 
become  references  7  and  9. 

2. In the sixth line of the development on page 1, a left parenthesis should be inserted
before $\mathbb{R}^k$.

3. The following footnote should be added to the bottom of page 1:

“A  preliminary  version  of  this  paper  was  presented  at  the  863rd  meeting  of  the  American 
Mathematical  Society.” 

4.  Reference  number  9  in  the  line  above  Lemma  1  on  page  2  should  be  changed  to  become 
Reference  number  8. 

5. The lowercase m in $2^m$ in the statement of Lemma 1 on page 2 should be an uppercase
M as you have already indicated.

6.  The  second  phrase  “such  that”  in  the  statement  of  Lemma  1  on  page  2  should  be  deleted 
as  you  have  already  indicated. 

* 7. The phrase “a set $B \in \mathcal{B}(\mathbb{R}^k)$” in the fourth line of the proof of Corollary 2 on page 2
should be deleted and replaced instead by the phrase “an $\mathcal{F}_\sigma$ subset $B$ of $\mathbb{R}^k$”. Note:
$\mathcal{F}_\sigma$ is an uppercase script F followed by a subscripted lowercase Greek letter sigma.

*  8.  The  phrase  “Recall  from  [7,  p.  161]”  in  the  7th  line  of  the  proof  of  Corollary  2  on  page 
2  should  be  replaced  by  the  phrase  “Note”. 

9.  The  uppercase  J  in  the  second  line  of  the  statement  of  Lemma  2  on  page  2  should  be 
replaced  by  a  lowercase  j. 


* Items 7 and 8 correct a small but critical oversight in the proof of Corollary 2 that slipped
by both the authors and the reviewers and was not noticed until the proofreading stage. If
you have any questions regarding these two changes please don’t hesitate to contact either
author. (Eric Hall at (214) 692-4367 or Gary Wise at (512) 471-3356.)
10. Extra space should be inserted between the symbol subscripted by i = 1 and the symbol
subscripted by j = 1 in the sixth line of the proof of Lemma 2 at the bottom of page 2.

11.  The  sentence  beginning  with  the  word  “Every”  near  the  top  of  page  3  should  not  be 
indented. 

12.  The  hyphen  following  the  word  “non”  in  the  third  line  of  Lemma  3  should  be  moved 
over  slightly  as  you  have  already  indicated. 

13. The word “if” in the third line of the proof of Theorem 1 on page 4 should be capitalized.

14.  The  word  “optimal”  in  reference  2  on  page  4  should  be  “optima”. 

15.  The  name  “L.  Linde”  in  reference  5  on  page  4  should  be  “Y.  Linde”. 

16.  The  phrase  “k  means”  in  reference  6  on  page  5  should  be  “k-means”. 

17.  Reference  7  on  page  5  should  be  deleted,  entirely. 

18.  Reference  8  on  page  5  should  be  renumbered  to  become  reference  7  and  the  journal 
name  “IEEE  Trans.  Inform.  Theory”  should  be  added  as  you  have  already  indicated. 

19.  Reference  9  on  page  5  should  be  renumbered  to  become  reference  8. 

20.  Reference  10  on  page  5  should  be  renumbered  to  become  reference  9. 

21.  The  phrase  “Electrical  Computer’’  should  be  “Electrical  and  Computer”  in  the  second 
address  on  page  5  as  you  have  already  indicated. 

22.  The  phrase  “University  of  Texas”  should  be  “The  University  of  Texas”  in  the  second 
address  on  page  5. 

23. The following email addresses should be included in the addresses:
author  1:  ebh@smunews.smu.edu 

author  2:  gwise@ccwf.cc.utexas.edu 


E.  Multidimensional  Convolution 


Results on multidimensional convolution are presented in the
paper entitled "Some Aspects of Multidimensional Convolution"
which appeared in Proceedings of the 1991 IEEE International
Conference on Acoustics, Speech, and Signal Processing and is
given in Appendix E. It is shown that multidimensional convolution
need not be associative. Further, for any positive integer k, it is
shown that the multidimensional convolution of two real valued,
bounded, integrable, nowhere zero functions defined on $\mathbb{R}^k$ can be
identically equal to zero. These results are discussed in an
algebraic setting, and a consequence involving random fields is
briefly discussed.


SOME  ASPECTS  OF  MULTIDIMENSIONAL  CONVOLUTION 


Eric  B.  Hall 
Abstract

It is shown that multidimensional convolution need not
be associative. Further, for any positive integer k, it is shown
that the multidimensional convolution of two real valued,
bounded, integrable, nowhere zero functions defined on $\mathbb{R}^k$
can be identically equal to zero. These results are discussed in
an algebraic setting, and a consequence involving random
fields is briefly considered.

Introduction 

Real  valued  functions  of  several  variables  frequently 
occur  in  such  areas  of  signal  processing  as  image  processing, 
optics,  and  oceanography.  In  these  areas,  as  well  as  in  many 
others,  convolution  plays  a  major  role.  This  paper  treats 
several  aspects  of  multidimensional  convolution  which  should 
be  of  interest  to  the  signal  processing  community. 

In applications, multidimensional convolution often
arises when considering multidimensional linear systems. In
this context, the linear system is characterized via convolution
with an integrable function g and the input to the linear system
is denoted by an integrable function f. The function h given by
f convolved with g then denotes the resulting output. A
problem which frequently arises in system identification is that
of deconvolution, which is concerned with approximating or
identifying the function g from a knowledge of the pair of
functions f and h.

In [1] it was shown that in a one-dimensional setting
there exist integrable, bounded, nowhere zero functions f and
g such that f convolved with g is identically equal to zero. That
is, in the context of linear systems, there exists a linear system
described via convolution with a fixed, bounded, nowhere
zero function g which may be a nopass filter to an input which
is nowhere zero. Clearly, such a phenomenon should be a
cause of some concern to one who is attempting to derive a
general method of deconvolution. In this paper, we extend this
result to the case of multidimensional convolution.

Development

Let k be a positive integer. For a set $S \subset \mathbb{R}^k$, we will let
$I_S$ denote the indicator function of the set S. We will denote
by $L_1(\mathbb{R}^k)$ the set of all extended real valued Lebesgue
integrable functions modulo almost everywhere equivalence
defined on $\mathbb{R}^k$ equipped with the norm given by the integral of
the absolute value of an element of $L_1(\mathbb{R}^k)$. By a k-sequence
of real numbers we will mean any function mapping $Z^k$ into $\mathbb{R}$
where Z denotes the integers, and we will denote the value of
such a function $\alpha$ at the point x via $\alpha_x$. A k-sequence will be
called absolutely summable if it is integrable with respect to
counting measure on the power set of $Z^k$. Further, we will
occasionally denote points x in $\mathbb{R}^k$ as $x = (x_1, x_2, \ldots, x_k)$
where the $x_j$'s are real numbers. Finally, for two points x and
y in $\mathbb{R}^k$ we will denote the Euclidean inner product via

$$\langle x, y \rangle = \sum_{j=1}^{k} x_j y_j.$$

Recall that the convolution of two functions f and g in
$L_1(\mathbb{R}^k)$, denoted by f * g, is defined via

$$(f * g)(x) = \int_{\mathbb{R}^k} f(x - y)\, g(y)\, dy$$

provided that this integral exists for all $x \in \mathbb{R}^k$. Further, we
recall [2, pp. 247-248] that if f and g are in $L_1(\mathbb{R}^k)$ then f * g
is also in $L_1(\mathbb{R}^k)$ and satisfies $\|f * g\|_{L_1} \le \|f\|_{L_1} \|g\|_{L_1}$.

The following lemma shows that multidimensional
convolution need not be associative.


Presented at the 1991 IEEE International Conference on Acoustics, Speech, and Signal Processing,
May 14-17, 1991; to be published in the Proceedings of the Conference.
Lemma 1: Let k be a positive integer. There exist three
bounded real valued Lebesgue measurable functions f, g, and
h defined on $\mathbb{R}^k$ such that, even though the convolutions are
each defined, $f * (g * h) \ne (f * g) * h$, i.e. such that
convolution is not associative.


Proof: Consider first the special case when k = 1. As in
[3, p. 177], for t and x real, define

$$p(t) = (1 - \cos(t))\, I_{[0, 2\pi]}(t), \quad f(x) = 1, \quad g(x) = p'(x), \quad \text{and} \quad h(x) = \int_{-\infty}^{x} p(t)\, dt.$$

Note that

$$(f * g)(x) = \int_{\mathbb{R}} f(x - t)\, g(t)\, dt = \int_{\mathbb{R}} p'(t)\, dt = p(2\pi) - p(0) = 0.$$

Further,

$$(g * h)(x) = \int_{\mathbb{R}} g(x - t)\, h(t)\, dt = \int_{\mathbb{R}} p'(x - t) \int_{-\infty}^{t} p(s)\, ds\, dt = (p * p)(x)$$

via integration by parts. Note that

$$(p * p)(x) = \int_{0}^{2\pi} (1 - \cos(x - y))\,(1 - \cos(y))\, I_{[x - 2\pi,\, x]}(y)\, dy.$$

Hence, (g * h)(x) is positive on $(0, 4\pi)$ and zero elsewhere.
Finally, even though (f * g) * h = 0, we see that f * (g * h) is a
positive constant.

Now, let k be an integer greater than 1. With f, g, and h
defined as in the preceding paragraph, let $\tilde{f}$, $\tilde{g}$, and $\tilde{h}$ map $\mathbb{R}^k$
into $\mathbb{R}$ via $\tilde{f}(x) = 1$, $\tilde{g}(x) = g(x_1)\, g(x_2) \cdots g(x_k)$, and $\tilde{h}(x) =
h(x_1)\, h(x_2) \cdots h(x_k)$. It follows immediately that $(\tilde{f} * \tilde{g}) = 0$
and $(\tilde{g} * \tilde{h})$ is positive on $(0, 4\pi)^k$ and zero elsewhere. Hence,
$(\tilde{f} * \tilde{g}) * \tilde{h} = 0$ but $\tilde{f} * (\tilde{g} * \tilde{h})$ is a positive constant.

Q.E.D.
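
The k = 1 case of the proof can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes NumPy, and the grid sizes, helper names, and the trapezoid rule are illustrative choices. It approximates (f * g), (g * h), and (p * p) for the functions of the proof and compares f * (g * h) with the value $(\int p)^2 = 4\pi^2$ that follows from (g * h) = (p * p).

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def p(t):
    """p(t) = 1 - cos(t) on [0, 2*pi], zero elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0.0) & (t <= TWO_PI), 1.0 - np.cos(t), 0.0)

def g(t):
    """g = p', i.e. sin(t) on [0, 2*pi], zero elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0.0) & (t <= TWO_PI), np.sin(t), 0.0)

def h(x):
    """h(x) = integral of p from -infinity to x (bounded but not integrable)."""
    xc = np.clip(np.asarray(x, dtype=float), 0.0, TWO_PI)
    return xc - np.sin(xc)

def trapezoid(values, step):
    """Composite trapezoid rule along the last axis (illustrative helper)."""
    return step * (values.sum(axis=-1) - 0.5 * (values[..., 0] + values[..., -1]))

# g and p vanish outside [0, 2*pi], so integrals against them run over that interval.
s = np.linspace(0.0, TWO_PI, 2001)
ds = s[1] - s[0]

# With f identically 1, (f * g)(x) equals the integral of g for every x.
f_conv_g = trapezoid(g(s), ds)

# (g * h)(x) and (p * p)(x) on a grid that covers (0, 4*pi).
x = np.linspace(-1.0, 2.0 * TWO_PI + 1.0, 801)
dx = x[1] - x[0]
shifted = x[:, None] - s[None, :]
g_conv_h = trapezoid(g(s)[None, :] * h(shifted), ds)
p_conv_p = trapezoid(p(s)[None, :] * p(shifted), ds)

# With f identically 1, f * (g * h) is the integral of (g * h) over the real line.
f_conv_gh = trapezoid(g_conv_h, dx)

print("(f * g)      ~", f_conv_g)
print("f * (g * h)  ~", f_conv_gh, " vs 4*pi^2 =", 4.0 * np.pi ** 2)
```

Under these assumptions (f * g) vanishes to rounding error, so (f * g) * h = 0, while f * (g * h) comes out as a strictly positive constant.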


Next, consider two bounded, real valued, Lebesgue
integrable functions f and g defined on $\mathbb{R}^k$. Further, assume
that f and g are nowhere zero. Does it follow that f * g is
nowhere zero? Does it follow that f * g is nonzero on some
nonempty set? From a linear systems viewpoint, does a
nowhere zero input to a linear time-invariant system described
via convolution with a fixed nowhere zero function result in an
output which is nonzero somewhere?

To begin, we will need the following notation. For an
absolutely summable k-sequence of real numbers $\alpha$, define a
bounded linear operator $T_\alpha$ on $L_1(\mathbb{R}^k)$ to $L_1(\mathbb{R}^k)$ via

$$(T_\alpha(f))(x) = \int_{Z^k} \alpha_y\, f(x - y)\, dC(y),$$

where C denotes counting measure on the power set of $Z^k$, for
any element f from $L_1(\mathbb{R}^k)$. For any two absolutely
summable k-sequences of real numbers $\alpha$ and $\beta$, it follows
that

$$((T_\alpha \circ T_\beta)(f))(x) = T_\alpha\Big( \int_{Z^k} \beta_y\, f(x - y)\, dC(y) \Big)
= \int_{Z^k} \int_{Z^k} \alpha_z\, \beta_y\, f(x - y - z)\, dC(y)\, dC(z)
= \int_{Z^k} \lambda_y\, f(x - y)\, dC(y)$$

where we define

$$\lambda_y = \int_{Z^k} \int_{Z^k} \alpha_p\, \beta_q\, g_y(p, q)\, dC(p)\, dC(q),$$

where $g_y(p, q)$ equals one if p + q = y and equals zero
otherwise. Finally, note that for any two elements f and g
from $L_1(\mathbb{R}^k)$ it follows via Fubini’s theorem that

$$(T_\alpha(f)) * (T_\beta(g)) = (T_\alpha \circ T_\beta)(f * g).$$

Theorem 1: Letting the above set notation, there exist two
non-identically zero absolutely summable k-sequences of real
numbers $\alpha$ and $\beta$ such that for any f and g from $L_1(\mathbb{R}^k)$,

$$(T_\alpha(f)) * (T_\beta(g)) = 0.$$

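Proof: Recall that the function $|\cos(x_1) \cos(x_2) \cdots \cos(x_k)|$
is expressible as a multiple Fourier series given by

$$\int_{Z^k} c_y \exp(i \langle x, y \rangle)\, dC(y)$$

where it follows easily that $c_y = a_{y_1} a_{y_2} \cdots a_{y_k}$ where $a_n = 0$
if n is odd and

$$a_n = \frac{2\,(-1)^{n/2}}{\pi\,(1 - n^2)}$$

if n is even. Further, if we define

$$f_1(x) = \tfrac{1}{2} \big( |\cos(x_1) \cos(x_2) \cdots \cos(x_k)| + \cos(x_1) \cos(x_2) \cdots \cos(x_k) \big)$$

and

$$f_2(x) = \tfrac{1}{2} \big( |\cos(x_1) \cos(x_2) \cdots \cos(x_k)| - \cos(x_1) \cos(x_2) \cdots \cos(x_k) \big),$$

then $f_1(x)\, f_2(x) = 0$,

$$f_1(x) = \int_{Z^k} \alpha_y \exp(i \langle x, y \rangle)\, dC(y),$$

and

$$f_2(x) = \int_{Z^k} \beta_y \exp(i \langle x, y \rangle)\, dC(y)$$

where

$$\alpha_y = \begin{cases} c_y / 2 & \text{if } y \notin \{-1, 1\}^k \\ 1 / 2^{k+1} & \text{if } y \in \{-1, 1\}^k \end{cases}$$

and

$$\beta_y = \begin{cases} c_y / 2 & \text{if } y \notin \{-1, 1\}^k \\ -1 / 2^{k+1} & \text{if } y \in \{-1, 1\}^k. \end{cases}$$

But,

$$f_1(x)\, f_2(x) = 0
= \int_{Z^k} \alpha_y \exp(i \langle x, y \rangle)\, dC(y) \int_{Z^k} \beta_y \exp(i \langle x, y \rangle)\, dC(y)
= \int_{Z^k} \int_{Z^k} \alpha_y\, \beta_z \exp(i (\langle x, y \rangle + \langle x, z \rangle))\, dC(y)\, dC(z)
= \int_{Z^k} \lambda_y \exp(i \langle x, y \rangle)\, dC(y)$$

where, as before,

$$\lambda_y = \int_{Z^k} \int_{Z^k} \alpha_p\, \beta_q\, g_y(p, q)\, dC(p)\, dC(q)$$

and $g_y(p, q)$ equals one if p + q = y and equals zero otherwise.
Note that, via Fubini’s theorem, $\lambda_y = 0$ for every y.
Thus, it follows that $(T_\alpha(f)) * (T_\beta(g)) = (T_\alpha \circ T_\beta)(f * g) = 0$
for any integrable f and g. Q.E.D.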

Scholium 1: Let k be a positive integer. There exist two real
valued, bounded, nowhere zero, Lebesgue integrable
functions defined on $\mathbb{R}^k$ such that their convolution is
identically equal to zero.

Proof: In the proof of Theorem 1, choose $f(x) = g(x) =
I_S(x)$ where $S = (-1, 1]^k$. Q.E.D.
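
The construction can be illustrated numerically for k = 1. The sketch below is not from the paper: it assumes NumPy, truncates the coefficient sequences to $|n| \le M$ (so the computed convolution is only approximately zero), and uses illustrative names throughout. It builds $\alpha$ and $\beta$ from the Fourier coefficients of $|\cos x|$ and $\cos x$, checks that the discrete convolution $\lambda$ is numerically zero near the origin, and checks that the piecewise constant functions $T_\alpha(I_S)$ and $T_\beta(I_S)$, which take the value $\alpha_m + \alpha_{m+1}$ (respectively $\beta_m + \beta_{m+1}$) on $(m, m+1]$ when $S = (-1, 1]$, have no zero values.

```python
import numpy as np

def coefficient_sequences(M):
    """Return n, alpha_n, beta_n for n = -M..M in the case k = 1."""
    n = np.arange(-M, M + 1)
    even = (n % 2 == 0)
    sign = np.where((n // 2) % 2 == 0, 1.0, -1.0)   # (-1)^(n/2) for even n
    a = np.zeros(n.size)                            # Fourier coefficients of |cos x|
    a[even] = (2.0 / np.pi) * sign[even] / (1.0 - n[even].astype(float) ** 2)
    alpha = 0.5 * a
    beta = 0.5 * a
    plus_minus_one = (np.abs(n) == 1)
    alpha[plus_minus_one] += 0.25                   # +1/2^(k+1) with k = 1
    beta[plus_minus_one] -= 0.25                    # -1/2^(k+1) with k = 1
    return n, alpha, beta

M = 400
n, alpha, beta = coefficient_sequences(M)

# lam[i] approximates lambda_y for y = i - 2*M; truncation leaves a small residual.
lam = np.convolve(alpha, beta)
central = lam[2 * M - 20: 2 * M + 21]

# Values taken by T_alpha(I_S) and T_beta(I_S) on the intervals (m, m + 1].
values_alpha = alpha[:-1] + alpha[1:]
values_beta = beta[:-1] + beta[1:]

print("max |lambda_y| for |y| <= 20:", np.max(np.abs(central)))
print("smallest |T_alpha(I_S)| value:", np.min(np.abs(values_alpha)))
print("smallest |T_beta(I_S)| value:", np.min(np.abs(values_beta)))
```

The residual in $\lambda$ shrinks as M grows, which is consistent with $\lambda_y = 0$ for the untruncated sequences.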

Before commenting further upon this result, we shall
detour for a moment to review a few algebraic concepts.

Recall that a nonempty set G and two operations + and · form
an associative ring, hereafter referred to as a ring, if G is an
Abelian group with respect to the + operation (denoting the
identity element by 0 and $(a)^{-1}$ by -a), G is closed and
associative with respect to ·, and finally if, for any a, b, and c
in G, a · (b + c) = a · b + a · c and (b + c) · a = b · a + c · a.
Further, if there exists an element u in G such that a · u = u ·
a = a for every a in G, then G is said to possess a unit element.
Also, if a · b = b · a for every a and b in G then G is said to be
a commutative ring. Recall that for a commutative ring G an
element $a \ne 0$ in G is said to be a zero-divisor if there exists an
element $b \ne 0$ in G such that a · b = 0. Further, recall that a
commutative ring is said to be an integral domain if it
possesses no zero-divisors. For a more complete discussion
of rings, the interested reader is referred to [4].

It follows easily that $L_1(\mathbb{R}^k)$ equipped with the
operations of pointwise addition and convolution is a
commutative ring $\mathcal{R}$; in fact, $\mathcal{R}$ is a commutative Banach
algebra. Even though it can be seen that this ring possesses no
unit element [2, p. 248], it does possess so-called
“approximate units” which often serve just as well for many
purposes.

The previous results can now be viewed in a different
setting. Recall that Lemma 1 showed that multidimensional
convolution need not be associative. Although this result may
seem surprising to some, notice that the function f given in the
proof of Lemma 1 is not an element of $L_1(\mathbb{R}^k)$. Further, from an
algebraic standpoint, the perhaps disturbing result of Scholium
1 yields the following corollary as a direct consequence.

Corollary 1: Let k be a positive integer. The commutative
ring given by $L_1(\mathbb{R}^k)$ equipped with the operations of
pointwise addition and convolution is not an integral domain.

Hence, Corollary 1 implies that (f * g) = 0 can occur
even when neither f nor g is equal to zero. In fact, we have
actually shown something stronger via Scholium 1 since
it exhibits bounded integrable functions f and g defined on $\mathbb{R}^k$
which are nowhere equal to zero and yet for which (f * g) is
identically equal to zero.

Finally, again let k be a positive integer. It follows from
Theorem 1 and Scholium 1 that there exists a random field
$\{X(p): p \in \mathbb{R}^k\}$ with integrable sample paths and a function
$f: \mathbb{R}^k \to \mathbb{R}$ which is Lebesgue integrable and nowhere zero
such that f * X is identically zero. Such a result should be of
interest to those in areas such as seismology, radio astronomy,
underwater acoustics, and channel equalization where blind
deconvolution techniques are frequently employed.




Conclusion

In this paper we have considered multidimensional
convolution from an algebraic standpoint and presented a
result which may be of interest to the engineering community.
In particular, we showed that, for any positive integer k, the
convolution of two nowhere zero, bounded, integrable, real
valued functions defined on $\mathbb{R}^k$ may be everywhere zero.
This result should be of interest to those attempting to identify
the input to a linear time-invariant system via some operations
on the output, such as in deconvolution problems.
F. The Concept of Finite Memory of a Stochastic Process or of a
Random Field

Results on the concept of finite memory of a stochastic process
or of a random field are presented in the paper entitled "A comment
on finite memory of stochastic processes" which appeared in the
September 1992 issue of the IEEE Transactions on Signal
Processing and is given in Appendix F. It is shown that a recently
proposed concept of finite memory for a zero mean strictly
stationary stochastic process results in a stochastic process of random
variables each of which is almost surely equal to zero. We eagerly
note that the earlier work about which this paper comments was
work supported by the Office of Naval Research.
ABSTRACT

It  is  shown  that  a  recently  proposed  concept  of  finite  memory  for  a  zero  mean  strictly 
stationary  stochastic  process  results  in  a  stochastic  process  of  random  variables  each  of 
which  is  almost  surely  equal  to  zero. 


DEVELOPMENT 


Let k be a positive integer and let $\{X(t): t \in \mathbb{R}\}$ be a zero mean, strictly stationary
stochastic process defined on some probability space and taking values in $\mathbb{R}^k$. In [1] such
a stochastic process is said to have finite memory if there exists a positive real number D
such that for any positive integer n and for any n times $t_1, t_2, \ldots, t_n$, the
random variables $\{X(t_1), X(t_2), \ldots, X(t_n)\}$ and $\{X(t_1 + d), X(t_2 + d), \ldots, X(t_n + d)\}$
are statistically independent for any real number d such that d > D. Here we note that such
a stochastic process is degenerate in the sense that any random variable in the stochastic
process is almost surely equal to zero.


First, consider the situation of a finite memory, zero mean, strictly stationary stochastic
process as above. Let 0 denote the origin of $\mathbb{R}^k$; let n = 2; let $t_1 = -2D$; and let $t_2 = 0$. In
this case note that the set of random variables $\{X(-2D), X(0)\}$ and the set of random
variables $\{X(-2D + d), X(d)\}$ must be statistically independent for any d > D. If we
choose d = 2D, then we see that X(0) must be independent from itself, and hence, since
E[X(0)] = 0, we see that X(0) = 0 a.s. Now, since the stochastic process is strictly
stationary, it follows that for any real number t, X(t) = 0 a.s. Hence, for each real number
t, each component of the random vector X(t) is almost surely zero.
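
The step from "X(0) is independent from itself" to "X(0) = 0 a.s." rests on a standard argument, sketched here for the reader; the set $A$ and the component notation $X_j(0)$ are introduced only for this sketch. For any Borel set $A \subset \mathbb{R}^k$, self-independence gives

$$P(X(0) \in A) = P(X(0) \in A,\ X(0) \in A) = P(X(0) \in A)^2, \quad \text{so} \quad P(X(0) \in A) \in \{0, 1\}.$$

Taking $A = \{x : x_j \le c\}$ shows that the distribution function of each component $X_j(0)$ takes only the values 0 and 1, so $X_j(0)$ is almost surely equal to a constant, and the zero mean assumption forces that constant to be

$$X_j(0) = E[X_j(0)] = 0 \quad \text{a.s.}$$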
G. Distribution of the Determinant of a Random Matrix

Results  on  the  distribution  of  the  determinant  of  a  random  matrix 
are  presented  in  the  paper  entitled  "A  note  on  the  distribution  of  the 
determinant  of  a  random  matrix"  which  appeared  in  the  February 
1991  issue  of  Statistics  and  Probability  Letters  and  is  given  in 
Appendix  G.  An  analysis  of  the  tail  behavior  of  a  probability  density 
function  of  the  determinant  of  a  random  matrix  is  presented,  and  an 
oversight  in  an  earlier  paper  on  this  subject  is  noted. 
Abstract: An analysis of a probability density function of the determinant of a random matrix is presented, and an oversight in an
earlier paper on this subject is noted.

Keywords: Determinant, random matrix.


Let $A_1$, $A_2$, $A_3$ and $A_4$ denote four mutually
independent identically distributed random variables
defined on the same probability space and
uniformly distributed over [0, 1]. Denote by M the
following matrix:

$$M = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}.$$

In Williamson and Downs (1988), a graph was
presented for a probability density function (pdf)
for the determinant of M. In this paper we show
that this graph provides a misleading representation
for such a pdf.

It follows straightforwardly that the random
variable $A_1 A_4$ has a pdf given by $-I_{(0,1)}(x) \log(x)$
(all logarithms in our paper are Naperian logarithms).
Let X and Y be independent random
variables defined on a common probability space,
each having pdf $-I_{(0,1)}(x) \log(x)$. Further, let
W = X - Y. Notice that the distribution of W is
the same as the distribution of the determinant of
M. Also, notice that there exists a pdf for W
which is even and which is supported on [-1, 1].
For $x \in (0, 1)$ we have that a pdf of W at x, say
p(x), is given by

$$p(x) = \int_0^{1-x} \log(u + x) \log(u)\, du.$$

Using integration by parts, we get

$$p(x) = \int_0^{1-x} \frac{u - u \log(u)}{u + x}\, du.$$

Now, upon simplification we get

$$p(x) = (1 - x)[2 - \log(1 - x)] + x \log(x) + x \int_0^{1-x} \frac{\log(u)}{u + x}\, du.$$

For a fixed positive number a, let $h: (0, 1) \to \mathbb{R}$
via

$$h(u) = \log(u) \log(1 + u/a) - \int_0^{-u/a} \frac{\log(1 - t)}{t}\, dt.$$

Note that $h'(u) = \log(u)/(u + a)$. Thus we see
that

$$\int_0^{1-x} \frac{\log(u)}{u + x}\, du = -\log(1 - x) \log(x) - \int_0^{-(1-x)/x} \frac{\log(1 - t)}{t}\, dt$$

and therefore we get that

$$p(x) = (1 - x)[2 - \log(1 - x)] + x \log(x) - x \log(1 - x) \log(x) - x \int_0^{-(1-x)/x} \frac{\log(1 - t)}{t}\, dt.$$

It follows straightforwardly that

$$p''(x) = -\frac{\log(1 - x)}{x} - \frac{\log(x)}{x(1 - x)} + \frac{\log(x)}{1 - x},$$

and hence we see that p''(x) > 0. Thus, we note
that, restricted to (0, 1), p(·) is convex. This shows
that the graph given in Fig. 1 of Williamson and
Downs (1988) is misleading as a representation for
a pdf of the determinant of M, and it points out
an important, yet often unheeded, caveat associated
with truncation effects in numerical schemes.

This research was partially supported by the Office of Naval Research.
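
The convexity claim can be sanity-checked numerically. The sketch below is illustrative only and not part of the paper: it assumes NumPy, evaluates $p(x) = \int_0^{1-x} \log(u + x) \log(u)\, du$ by a midpoint rule after the substitution $u = v^2$ (which tames the logarithmic singularity at 0), and compares a finite-difference second derivative with $-\log(x(1 - x))/x$, which is the three-term expression for p''(x) after combining its last two terms. The function names, step sizes, and grid sizes are assumptions of this sketch.

```python
import numpy as np

def pdf_w(x, n=20000):
    """Approximate p(x) = integral_0^{1-x} log(u + x) log(u) du for x in (0, 1)."""
    b = np.sqrt(1.0 - x)
    v = (np.arange(n) + 0.5) * (b / n)           # midpoints of (0, b)
    # Substituting u = v^2 gives the integrand 4 v log(v) log(v^2 + x).
    integrand = 4.0 * v * np.log(v) * np.log(v * v + x)
    return integrand.sum() * (b / n)

def second_derivative(x):
    """Closed form p''(x) = -log(x (1 - x)) / x on (0, 1)."""
    return -np.log(x * (1.0 - x)) / x

def finite_difference_second_derivative(x, delta=0.01):
    """Central second difference of the numerically integrated pdf."""
    return (pdf_w(x + delta) - 2.0 * pdf_w(x) + pdf_w(x - delta)) / delta ** 2

def mass_on_unit_interval(m=200):
    """Midpoint-rule estimate of the integral of p over (0, 1)."""
    xs = (np.arange(m) + 0.5) / m
    return sum(pdf_w(x) for x in xs) / m

if __name__ == "__main__":
    for x in (0.25, 0.5, 0.75):
        print(x, finite_difference_second_derivative(x), second_derivative(x))
    print("mass on (0, 1):", mass_on_unit_interval())
```

Since the pdf of W is even, the mass on (0, 1) should come out near 1/2, which gives an independent check on the integral formula for p(x).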
H. Stationary Random Processes


Results on stationary random processes are presented in the
paper entitled "A Cautionary Aspect of Stationary Random
Processes" which appeared in the November 1991 issue of IEEE
Transactions on Circuits and Systems and is given in Appendix H. A
problem associated with determining the stationarity of a random
process from discrete time samples is noted. In particular, a
nonstationary Gaussian random process $\{X(t): t \in \mathbb{R}\}$ is given such
that for any positive real number $\Delta$, the discrete time random
process $\{X(n\Delta): n \in Z\}$ is strictly stationary, where Z denotes the set
of all integers.
A Cautionary Aspect of Stationary Random
Processes

Gary  L.  Wise 


Abstract: A problem associated with determining the stationarity of a
random process from discrete time samples is noted.

Development 

Let $\{X(t): t \in \mathbb{R}\}$ be some random process. Let $\Delta$ be a
positive real number, and consider the random process $\{X(n\Delta):
n \in Z\}$, where Z denotes the set of integers. In many practical
problems, one would be interested in knowing whether or not
the random process $\{X(t): t \in \mathbb{R}\}$ is stationary. However, due to
the current digital trend in signal processing, one might attempt
to determine the stationarity of $\{X(n\Delta): n \in Z\}$. What if $\{X(n\Delta):
n \in Z\}$ were stationary for any positive real number $\Delta$? Would
this imply stationarity of $\{X(t): t \in \mathbb{R}\}$? We show by an example
that the answer to this second question is no.

Let $\{Y(t): t \in \mathbb{R}\}$ be a stationary zero mean Gaussian random
process defined on some underlying probability space, such that
$E[Y(t)\, Y(t + \tau)] = e^{-|\tau|}$. Define the stationary zero mean
Gaussian random process $\{Z(t): t \in \mathbb{R}\}$ via Z(t) = Y(2t). Now,
define a zero mean Gaussian random process $\{X(t): t \in \mathbb{R}\}$ via
X(t) = Y(t) if t is rational, and X(t) = Z(t) if t is irrational.
Observe that $\{X(t): t \in \mathbb{R}\}$ is not stationary since if t and $\tau$ are
rational, then $E[X(t)\, X(t + \tau)] = e^{-|\tau|}$, yet if t is irrational and
$\tau$ is rational, then $E[X(t)\, X(t + \tau)] = e^{-2|\tau|}$. Next, pick any
positive real number $\Delta$. Note that if $\Delta$ is rational, then $n\Delta$ is
rational for all integers n. Also, if $\Delta$ is irrational, then $n\Delta$ is
irrational for all nonzero integers n. Further, Y(0) = Z(0) = X(0).
Hence, for all integers n, if $\Delta$ is rational, then $X(n\Delta) = Y(n\Delta)$,
and if $\Delta$ is irrational, then $X(n\Delta) = Z(n\Delta)$. Thus for any
positive real number $\Delta$, $\{X(n\Delta): n \in Z\}$ is a stationary Gaussian
random process, yet $\{X(t): t \in \mathbb{R}\}$ is a Gaussian random process
that is not stationary.
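
The covariance structure of this example can be explored with exact arithmetic. The sketch below is an illustration rather than part of the paper: floating-point numbers cannot distinguish rational from irrational times, so times are restricted to the hypothetical form $a + b\sqrt{2}$ with rational a and b (rational exactly when b = 0), and the covariance $E[Y(u)\, Y(v)] = e^{-|u - v|}$ is an assumption of the sketch. All function names are illustrative.

```python
from fractions import Fraction
import math

SQRT2 = math.sqrt(2.0)

def is_rational(t):
    """A time t = (a, b) stands for a + b*sqrt(2) with rational a and b."""
    return t[1] == 0

def value(t):
    """Floating-point value of the time a + b*sqrt(2)."""
    return float(t[0]) + float(t[1]) * SQRT2

def scale(n, t):
    """Integer multiple n*t of the time t."""
    return (n * t[0], n * t[1])

def add(s, t):
    """Sum of two times."""
    return (s[0] + t[0], s[1] + t[1])

def warp(t):
    """X(t) = Y(warp(t)): Y(t) at rational t, Z(t) = Y(2t) at irrational t."""
    return value(t) if is_rational(t) else 2.0 * value(t)

def cov_x(s, t):
    """E[X(s) X(t)] under the assumed covariance E[Y(u) Y(v)] = exp(-|u - v|)."""
    return math.exp(-abs(warp(s) - warp(t)))

def is_lag_invariant(delta, span=3):
    """Check that cov_x(n*delta, m*delta) is unchanged by a common shift of n and m."""
    for n in range(-span, span + 1):
        for m in range(-span, span + 1):
            base = cov_x(scale(n, delta), scale(m, delta))
            moved = cov_x(scale(n + 1, delta), scale(m + 1, delta))
            if abs(base - moved) > 1e-12:
                return False
    return True

ZERO = (Fraction(0), Fraction(0))
ONE = (Fraction(1), Fraction(0))
ROOT2 = (Fraction(0), Fraction(1))

# Same lag of 1, different covariance: the continuous-time process is not stationary.
print(cov_x(ZERO, ONE), cov_x(ROOT2, add(ROOT2, ONE)))

# Sampled versions are lag invariant for rational and irrational sampling intervals.
print(is_lag_invariant((Fraction(1, 2), Fraction(0))), is_lag_invariant(ROOT2))
```

Because the process is zero mean and Gaussian, lag invariance of the covariance of the samples is what strict stationarity of the sampled process amounts to here.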


Manuscript  received  July  12,  1991.  This  work  was  supported  by  the 
Office of Naval Research under Grant N00014-90-J-1712. This paper
was  recommended  by  Editor  R,  Liu. 
J. Martingale characteristics of a Wiener process

Results on a martingale characterization of a Wiener process are
presented in the paper entitled "A counterexample to a martingale
characterization of a Wiener process" which was planned to have
been given in Appendix J. We hasten to note that this
investigator has recently gone through the trauma of having
experienced a stroke. Unfortunately, he lost all of his
documentation of this paper at some time during this experience.
However, the paper should appear in the journal Statistics and
Probability Letters, and in it we show that a recently proposed
scheme for characterizing a Wiener process was incorrect. We
regret this omission in this report.


K.  Estimation  of  a  random  variable  based  on  multidimensional  data 


Results  on  estimation  of  a  random  variable  based  on 
multidimensional  data  are  presented  in  the  paper  entitled 
"Estimation  of  a  random  variable  based  on  multidimensional  data" 
which appeared in the Proceedings of the 1992 IEEE International
Conference on Acoustics, Speech, and Signal Processing and is
presented in Appendix K. Several aspects associated with the mean
square estimation of a second order random variable based upon
elements from a random field are considered. Throughout the paper,
the oft-neglected role of the underlying probability space is
stressed. Numerous examples are presented that point out many of
the subtleties associated with this endeavor.
ABSTRACT

Several aspects associated with the mean square
estimation of a second order random variable based upon
elements from a random field are considered. Throughout
the paper, the oft-neglected role of the underlying
probability space is stressed. Numerous examples are
presented that point out many of the subtleties associated
with this endeavor.

INTRODUCTION 

Let $\mathbb{N}$ denote the set of positive integers and let $\aleph_0$
denote the cardinality of $\mathbb{N}$. Let $k \in \mathbb{N}$ and let
$\varepsilon$ be a positive number. Let $S = \{j\varepsilon : j \in \mathbb{N}\}$,
and let $G = S^k$. We may view $G$ as a grid of points. For points
$t = (t_1, t_2, \ldots, t_k) \in G$ and $s = (s_1, s_2, \ldots, s_k) \in G$, we
write $s \le t$ to mean that $s_i \le t_i$ for $i = 1, 2, \ldots, k$. Note
that this relation is a partial order on $G$. This paper will
be concerned with attempts to estimate a second order
random variable $X$ via estimates of the form
$E[X \mid Y_p : p \le n]$, where the $Y_p$'s are random variables
indexed by $G$. Notice that $\{Y_p : p \in G\}$ is a random field.
Also, note that, in this situation,
$\{E[X \mid Y_p : p \le n] : n \in G\}$ is a second order random field.
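The componentwise relation on the grid is only a partial order: for k > 1 there are pairs of points with neither below the other. The following minimal sketch illustrates this. It assumes that a point of G is represented by its tuple of integer multipliers (j_1, ..., j_k), standing for (j_1 ε, ..., j_k ε); the helper names `leq` and `points_below` are hypothetical.

```python
from itertools import product

def leq(s, t):
    """Componentwise order on the grid: s <= t iff s_i <= t_i for every i."""
    return all(si <= ti for si, ti in zip(s, t))

def points_below(n):
    """All index tuples p with p <= n; coordinate i runs over 1..n_i."""
    return list(product(*(range(1, ni + 1) for ni in n)))

# Three points of a two-dimensional grid (k = 2), given by their multipliers.
s, t, u = (1, 3), (2, 1), (2, 3)

print(leq(s, t) or leq(t, s))   # are s and t comparable?
print(leq(s, u), leq(t, u))     # do both lie below u?
print(len(points_below(u)))     # how many Y_p enter E[X | Y_p : p <= u]
```

The count returned by `points_below` is the number of observations that the estimate at a given grid point conditions on.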

Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $X$ be a
second order random variable defined on $(\Omega, \mathcal{F}, P)$, and
let $\{Y_p : p \in G\}$ be a random field defined on $(\Omega, \mathcal{F}, P)$.
For each $n \in G$, pick and fix a version of $E[X \mid Y_p : p \le n]$.
For $n \in G$, let $\mathcal{F}_n = \sigma(Y_p : p \le n)$ and let
$M_n = E[X \mid \mathcal{F}_n]$. As noted above, $M_n$ is a second order random
variable. Further, $M_n$ is $\mathcal{F}_n$-measurable. Finally, note
that for $n_1$ and $n_2$ in $G$ with $n_1 \le n_2$,

$$E[M_{n_2} \mid \mathcal{F}_{n_1}] = E[E[X \mid Y_p : p \le n_2] \mid Y_p : p \le n_1] = E[X \mid Y_p : p \le n_1] = M_{n_1}$$

a.s. via standard properties of conditional
expectation. Thus, we see that $\{M_n : n \in G\}$ is a second
order multiparameter martingale with respect to the
filtration $\{\mathcal{F}_n : n \in G\}$. We remark that the above
comments hold where $(\Omega, \mathcal{F}, P)$ is any probability
space. Now we pose the question: how might one estimate
the random variable $X$ from the data $\{Y_p : p \in G\}$ so as to
minimize the mean square error?

PRELIMINARIES

Before proceeding, we will review some definitions
and introduce some conventions and notations which will
prove useful. We will let $\mathbb{R}$ denote the set of real
numbers and $\mathcal{B}(\mathbb{R})$ denote the family of Borel
subsets of $\mathbb{R}$. For a set $S$, we will let $\mathcal{P}(S)$ denote the
power set of $S$ and $I_S$ denote the indicator function of $S$.
If $A$ is a subset of $\mathbb{R}$, $-A$ will denote the set
$\{x \in \mathbb{R} : -x \in A\}$. By a
standard Gaussian measure we will mean a Gaussian
measure whose first moment is zero and whose second
moment is one. For any measure $m$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ we will
let $m_*$ denote the inner measure on $(\mathbb{R}, \mathcal{P}(\mathbb{R}))$ induced by
$m$ and we will let $m^*$ denote the outer measure on
$(\mathbb{R}, \mathcal{P}(\mathbb{R}))$ induced by $m$. Recall that for any subset $A$ of $\mathbb{R}$,
$m_*$ and $m^*$ are defined via
$m_*(A) = \sup\{m(B) : A \supseteq B \in \mathcal{B}(\mathbb{R})\}$ and
$m^*(A) = \inf\{m(B) : A \subseteq B \in \mathcal{B}(\mathbb{R})\}$.
We will say that a subset $S$ of
the reals is saturated non-$m$-measurable if $m_*(S) = m_*(S^c) = 0$.
We will let $\lambda$ denote Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.

Throughout this paper, we will let $n$ and $p$, with or
without subscripts, denote elements of $G$; we will let $m$,
$\nu$, $P$, and $\mu$ denote measures; we will let $i$, $j$, and $k$ denote
positive integers; and we will let $N$ denote an integer
greater than one.
The following result is developed in [1].


Theorem 1: Let $N$ be an integer greater than 1. There
exist $N$ subsets $S_1, S_2, \ldots, S_N$ of the real line that
partition the reals and are each saturated non-Lebesgue
measurable. Letting $S_1, \ldots, S_N$ be as above and letting $\mu$
be a probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that $\mu$ is
equivalent to Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, there exists
a probability space $(\mathbb{R}, Q, P)$ where $Q$ is given by

$$Q = \{(S_1 \cap B_1) \cup \cdots \cup (S_N \cap B_N) : B_i \in \mathcal{B}(\mathbb{R}) \text{ for } i = 1, \ldots, N\}$$

and where

$$P((S_1 \cap B_1) \cup \cdots \cup (S_N \cap B_N)) = \frac{1}{N}[\mu(B_1) + \cdots + \mu(B_N)].$$

The  following  corollary  is  an  immediate  result  of 
Theorem  1. 

Corollary 1: Let Theorem 1 set notation. The
$\sigma$-algebra $Q$ includes $\mathcal{B}(\mathbb{R})$ and contains $S_1, \ldots, S_N$.
Further, the probability measure $P$ agrees with $\mu$ on $\mathcal{B}(\mathbb{R})$
and, for the probability space $(\mathbb{R}, Q, P)$, $\mathcal{B}(\mathbb{R})$ is
independent of $\sigma(S_1, \ldots, S_N)$.
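The claims of Corollary 1 can be read off the formula for P in Theorem 1 by choosing the Borel sets suitably. The following worked steps are a sketch of that calculation, writing μ for the probability measure of Theorem 1, B for an arbitrary Borel set, and i for any fixed index in {1, ..., N}.

```latex
% Take B_i = B and B_j = \emptyset for j \ne i in the formula for P:
P(S_i \cap B) = \frac{1}{N}\,\mu(B).
% Take B = \mathbb{R} in the line above:
P(S_i) = \frac{1}{N}.
% Take B_1 = \cdots = B_N = B, using that S_1, \ldots, S_N partition \mathbb{R}:
P(B) = \frac{1}{N}\bigl[\mu(B) + \cdots + \mu(B)\bigr] = \mu(B).
% Combining the three lines:
P(S_i \cap B) = P(S_i)\,P(B).
```

Since every set in the σ-algebra generated by the partition is a finite disjoint union of the S_i, the product formula extends to that σ-algebra by additivity.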

Further, we recall the following result from [2],
which calls into question the validity of many claims in
mean square estimation theory.

Theorem 2: For any real number $B$, there exists a
probability space $(\Omega, \mathcal{F}, P)$, two bounded random
variables $X$ and $Y$ defined on $(\Omega, \mathcal{F}, P)$, and a function
$f : \mathbb{R} \to \mathbb{R}$ such that $E[(Y - E[Y \mid X])^2] > B$ and yet $f(X) = Y$
pointwise on $\Omega$.

Now, we present an observation which will be of
use to us.

Lemma 1: Let the introduction set notation. If
$n_1 \le n_2 \le \cdots$ is any nondecreasing infinite sequence of
elements from $G$, then the sequence of random variables
$\{E[X \mid Y_{n_i} : i = 1, 2, \ldots, j] : j \in \mathbb{N}\}$ is a second order
martingale with respect to the filtration $\{\mathcal{F}_{n_i} : i \in \mathbb{N}\}$.

Proof: First, it follows from Jensen's inequality for
conditional expectations that
$\{E[X \mid Y_{n_i} : i = 1, 2, \ldots, j] : j \in \mathbb{N}\}$ is a sequence of second
order random variables. Also, it follows from the
definition of conditional expectation that
$E[X \mid Y_{n_i} : i = 1, 2, \ldots, j]$ is $\mathcal{F}_{n_j}$-measurable. Finally,
note that for positive integers $j_1 < j_2$, it follows from
standard properties of conditional expectation that

$$E[E[X \mid Y_{n_i} : i = 1, 2, \ldots, j_2] \mid Y_{n_i} : i = 1, 2, \ldots, j_1] = E[X \mid Y_{n_i} : i = 1, 2, \ldots, j_1]$$

a.s. Q.E.D.
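The last step of the proof is the tower property of conditional expectation: conditioning first on more data and then on less gives the same result as conditioning on less directly. The sketch below checks this exactly on a small finite probability space. The space, the variables `X`, `Y1`, `Y2`, and the helper `cond_exp` are all hypothetical choices made for illustration, not objects from the paper.

```python
from fractions import Fraction
from collections import defaultdict

OMEGA = range(8)                              # eight equally likely outcomes
P = {w: Fraction(1, 8) for w in OMEGA}

def cond_exp(f, observed):
    """E[f | observed] on the finite space; `observed` maps w to a hashable label."""
    mass, total = defaultdict(Fraction), defaultdict(Fraction)
    for w in OMEGA:
        key = observed(w)
        mass[key] += P[w]
        total[key] += P[w] * f(w)
    return {w: total[observed(w)] / mass[observed(w)] for w in OMEGA}

X = lambda w: w * w                           # quantity to be estimated
Y1 = lambda w: w // 4                         # first observation
Y2 = lambda w: w % 2                          # second observation

coarse = lambda w: (Y1(w),)                   # data available at step 1
fine = lambda w: (Y1(w), Y2(w))               # data available at step 2

M1 = cond_exp(X, coarse)
M2 = cond_exp(X, fine)
tower = cond_exp(lambda w: M2[w], coarse)     # E[ E[X | Y1, Y2] | Y1 ]

print(tower == M1)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.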


DEVELOPMENT 


The development will be a set of examples which
will serve to indicate some problems which may await the
unwary investigator. In particular, these examples
suggest the importance of a careful consideration of the
underlying probability space.


EXAMPLE A: For an integer $N > 1$, let $S_1, \ldots, S_N$ be
subsets of the real line that partition the reals and are each
saturated non-Lebesgue measurable. Let $(\mathbb{R}, Q, P)$ be the
probability space provided by Theorem 1 for these sets
where the measure $\mu$ in Theorem 1 is taken to be standard
Gaussian measure on $\mathcal{B}(\mathbb{R})$. All random variables in this
example will be defined on the probability space
$(\mathbb{R}, Q, P)$. Let $X(\omega) = I_{S_1}(\omega) - I_{S_1^c}(\omega)$. Note that $X$ is a
Bernoulli random variable, and $P(X = -1) = 1 - \frac{1}{N}$ and
$P(X = 1) = \frac{1}{N}$. For some $p_0 \in G$, let $Y_{p_0}(\omega) = \sigma\omega$, where $\sigma$
is a positive real number, and for all other $p$'s in $G$, let $Y_p = 0$.
Note that $\{Y_p : p \in G\}$ is a Gaussian random field.
Further, note that $\sigma(Y_p : p \in G) = \mathcal{B}(\mathbb{R})$ and
$\sigma(X) = \{\Omega, \emptyset, S_1, S_1^c\}$. Recalling Corollary 1, we see that $X$ is
independent of the data $\{Y_p : p \in G\}$. Thus, we see that
$M_n = E[X] = \frac{2 - N}{N}$ a.s. for all $n \in G$. Further, we see that
$P(M_n = X) = 0$ for all $n \in G$. However, if one knew $p_0$, one could
reconstruct $X$ precisely from $Y_{p_0}$ via $X = g(Y_{p_0})$ pointwise
on $\mathbb{R}$, where $g : \mathbb{R} \to \mathbb{R}$ via

$$g(x) = I_{S_1}\!\left(\frac{x}{\sigma}\right) - I_{S_1^c}\!\left(\frac{x}{\sigma}\right).$$

Note that $X$ can be precisely written as a function of an
independent random variable. Further, note that this can
be done regardless of how small or large the positive
variance of $Y_{p_0}$ is, and $P(X = 1)$ can be arbitrarily small
by choice of $N$. Of course, knowledge of $p_0$ is crucial.
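To quantify the gap between the two estimators in Example A, the mean square errors can be worked out directly. The sketch below uses only the facts stated in the example, namely P(X = 1) = 1/N, P(X = -1) = 1 - 1/N, and that independence forces the conditional expectation M_n to equal the constant E[X].

```latex
\begin{align*}
E[X] &= \frac{1}{N} - \Bigl(1 - \frac{1}{N}\Bigr) = \frac{2 - N}{N}, \\
E\bigl[(X - M_n)^2\bigr] &= E[X^2] - (E[X])^2
    = 1 - \frac{(2 - N)^2}{N^2} = \frac{4(N - 1)}{N^2}, \\
E\bigl[(X - g(Y_{p_0}))^2\bigr] &= 0 .
\end{align*}
```

For N = 2 the conditional expectation attains the largest possible error for a random variable taking values in {-1, 1}, namely 1, while the pointwise reconstruction has zero error for every N.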


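EXAMPLE B: In this example, we let the random
variable $X$ be Gaussian, and we get a result similar to that
in Example A. For an integer $N > 1$, let the probability
space be the same as in Example A, and let
$X(\omega) = \omega[I_{S_1}(\omega) - I_{S_1^c}(\omega)]$. For $B \in \mathcal{B}(\mathbb{R})$,

$$P(X \in B) = P((S_1 \cap B) \cup (S_1^c \cap (-B))) = \mu(B),$$

where $\mu$, as in Example A, is standard Gaussian measure on $\mathcal{B}(\mathbb{R})$.
Thus, we see that $X$ is a standard Gaussian random
variable. Let the random field $\{Y_p : p \in G\}$ be as in
Example A. Note that

$$E[X \mid Y_{p_0}] = E[\omega(I_{S_1} - I_{S_1^c}) \mid \mathcal{B}(\mathbb{R})] = \omega\left[\frac{2 - N}{N}\right] \text{ a.s.,}$$

since the identity map is Borel measurable, since $S_1$ is independent of
$\mathcal{B}(\mathbb{R})$, and since $P(S_1) = \frac{1}{N}$. In this case, we see that for
$p_0 \le n$, $M_n(\omega) = \omega\left[\frac{2 - N}{N}\right]$ a.s. When $N = 2$, $M_p = 0$ a.s.
for all $p \in G$. On the other hand, for large $N$, $M_n(\omega)$ is close
to $-\omega$ for $p_0 \le n$, and $P(X = -\omega) = \frac{N - 1}{N}$. Nevertheless, we
have that $P(M_n = X) = 0$ for any $n \in G$. However, we can
once again write $X$ precisely as a function of $Y_{p_0}$. That is,
let $h : \mathbb{R} \to \mathbb{R}$ via $h(x) = \frac{x}{\sigma}\, g(x)$, where $g$ is as in Example
A, and note that $X = h(Y_{p_0})$ pointwise on $\mathbb{R}$.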

EXAMPLE C: For an integer $N > 1$, let the probability
space be the same as in Example A. Let
$X(\omega) = \omega(N I_{S_1}(\omega) - 1)$. Notice that $E[X] = 0$. Also,
notice that if $N = 2$, $X$ is a Gaussian random variable. Let
the random field $\{Y_p : p \in G\}$ be defined via $Y_p(\omega) = s_p \omega$
for each $p \in G$, where $\{s_p : p \in G\}$ is a set of nonzero real
numbers. Notice that in this case, $\{Y_p : p \in G\}$ is a
Gaussian random field, and each random variable in this
random field has zero mean and a positive variance. Now,
what if we tried to estimate $X$ from elements of the random
field $\{Y_p : p \in G\}$? Notice that $\sigma(Y_p : p \in G) = \mathcal{B}(\mathbb{R})$.
Further, notice that

$$E[X \mid \mathcal{B}(\mathbb{R})] = E[\omega(N I_{S_1} - 1) \mid \mathcal{B}(\mathbb{R})] = 0 \text{ a.s.,}$$

since the identity map is $\mathcal{B}(\mathbb{R})$-measurable, since $S_1$ is independent
of $\mathcal{B}(\mathbb{R})$, and since $E[X] = 0$. Thus, for any $n \in G$, $M_n = 0$
a.s. However, notice that for any $p \in G$, $X$ can be written
precisely as a function of $Y_p$. That is, $X(\omega) = r_p(Y_p(\omega))$
where $r_p : \mathbb{R} \to \mathbb{R}$ via

$$r_p(x) = \frac{x}{s_p}\left(N I_{S_1}\!\left(\frac{x}{s_p}\right) - 1\right).$$


EXAMPLE D: In this example, assume $k > 1$, and let the
probability space $(\Omega, \mathcal{F}, P)$ be given by $\mathbb{R}$, $\mathcal{B}(\mathbb{R})$, and
standard Gaussian measure on $\mathcal{B}(\mathbb{R})$. Let $X(\omega) = \omega$. Let $d$ be
the element of $G$ given by $d = (\varepsilon, \varepsilon, \ldots, \varepsilon)$. Now, let
$Y_d(\omega) = \omega I_{(-\infty, 1]}(\omega)$, and for integers $j > 1$, let
$Y_{jd}(\omega) = \omega I_{(j-1, j]}(\omega)$. For $n$ in $G$ but not equal to positive
integral multiples of $d$, let $Y_n = 0$. Now how might we
estimate $X$? Fix any point $n_0$ in $G$, and for positive
integers $i \le k$ and positive integers $j$, let $p_j$ be the point
whose coordinates are the same as those of $n_0$ except that
the $i$-th coordinate is the $i$-th coordinate of $n_0$ plus $j\varepsilon$.
Then $\{E[X \mid Y_{p_1}, \ldots, Y_{p_j}] : j \in \mathbb{N}\}$ is an ordinary
martingale (see Lemma 1); indeed, in the context of
random fields, it is called an $i$-martingale. It follows
immediately that this martingale is equal to a fixed
random variable for all $j$ greater than some positive
integer $J$. Further, note that the martingale does not
converge to $X$. Thus, for no value of $i = 1, 2, \ldots, k$
will this $i$-martingale converge to $X$. However, for
positive integers $j$, if we let $q_j = jd$, the sequence
$\{E[X \mid Y_{q_1}, \ldots, Y_{q_j}] : j \in \mathbb{N}\}$ is an ordinary martingale
(see Lemma 1); and it follows from elementary martingale
theory that this martingale converges to $X$ in $L_p(\Omega, \mathcal{F}, P)$, for
any $p \in [1, \infty)$, and a.s. as $j \to \infty$; indeed, here it will
converge pointwise.
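The convergence along the diagonal in Example D can be made quantitative. The sketch below rests on an assumption about the indicator sets, namely that Y_d uses (-∞, 1] and Y_{jd} uses (j-1, j]. Under that assumption, observing Y_{q_1}, ..., Y_{q_j} reveals ω whenever ω ≤ j (up to a null set) and otherwise reveals only that ω > j, so the conditional expectation equals X on {X ≤ j} and the constant E[X | X > j] on {X > j}. The function names are hypothetical.

```python
import math

def phi(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def tail(x):
    """P(X > x) for a standard Gaussian X."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def diagonal_mse(j):
    """E[(X - M_j)^2] when M_j = X on {X <= j} and M_j = E[X | X > j] on {X > j}."""
    q = tail(j)
    second_moment_on_tail = j * phi(j) + q    # E[X^2; X > j], by integration by parts
    first_moment_on_tail = phi(j)             # E[X; X > j]
    # The error lives on {X > j}: E[X^2; X > j] - (E[X; X > j])^2 / P(X > j).
    return second_moment_on_tail - first_moment_on_tail ** 2 / q

for j in range(1, 6):
    print(j, diagonal_mse(j))
```

The printed errors shrink rapidly with j, in line with the convergence claimed for the diagonal martingale, whereas the i-martingales of the example stop changing after finitely many steps.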

EXAMPLE E: For an integer $N > 1$, let the probability
space be the same as in Example A. Let
$X(\omega) = \omega[I_{S_1}(\omega) - I_{S_1^c}(\omega)]$. As in Example B, note that $X$ is a standard
Gaussian random variable. Now, for each $n \in G$, let $s_n$
denote the sum of the components of $n$, and let
$Y_n(\omega) = \omega s_n$. Notice that $\{Y_p : p \in G\}$ is a Gaussian random
field, and $Y_p$ has zero mean and a variance of $(s_p)^2$. Now,
for any $p \in G$, as in Example B, $M_p(\omega) = \omega\left[\frac{2 - N}{N}\right]$ a.s. When
$N = 2$, we see that $M_p = 0$ a.s. for all $p \in G$. In any case,
$M_p = E[X \mid Y_q : q \le p]$ must be a Borel measurable function of
$\omega$, since $\sigma(Y_q : q \le p) = \mathcal{B}(\mathbb{R})$, and $X$ is not a Borel
measurable function of $\omega$. Thus, once again, conditional
expectation is of no help to us here. However, for any
$p \in G$, we can write $X$ precisely as a function of $Y_p$; that
is,

$$X = \frac{Y_p}{s_p}\, g\!\left(\frac{\sigma Y_p}{s_p}\right)$$

pointwise on $\mathbb{R}$, where $g$ is the function
given in Example A.

EXAMPLE F: For an integer $N > 1$, let the probability
space be the same as in Example A. Let $X(\omega) = \omega$, and
note that $X$ is a standard Gaussian random variable. For
some $N$ distinct points $p_1, p_2, \ldots, p_N$ in $G$, let
$Y_{p_i}(\omega) = \omega I_{S_i}(\omega)$, and for all other points $p \in G$, let $Y_p = 0$. Note
that

$$X(\omega) = \sum_{i=1}^{N} Y_{p_i}(\omega)$$

pointwise on $\mathbb{R}$. Further, for
certain points $n \in G$, $M_n(\omega) = X(\omega)$ pointwise on $\mathbb{R}$, and
the cardinality of such points $n$ is $\aleph_0$. Note that if $k > 1$,
depending on the location of the $N$ points
$p_1, p_2, \ldots, p_N$, there could exist a subset $H$ of $G$
having cardinality $\aleph_0$ such that $M_n = 0$ for all $n \in H$.

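EXAMPLE G: For an integer $N > 1$, let the probability
space be the same as in Example A. Let $X(\omega) = \omega$, and note
that $X$ is a standard Gaussian random variable. For some $N$
points $p_1, p_2, \ldots, p_N$ in $G$, let $Y_{p_i}(\omega) = \omega I_{S_i}(\omega) + 1$.
Note that

$$X(\omega) = \left(\sum_{i=1}^{N} Y_{p_i}(\omega)\right) - N = \left(\prod_{i=1}^{N} Y_{p_i}(\omega)\right) - 1$$

pointwise on $\mathbb{R}$. Further, for certain points $n \in G$,
$M_n(\omega) = X(\omega)$ pointwise on $\mathbb{R}$, and

$$\mathrm{card}(\{n \in G : M_n(\omega) = X(\omega) \text{ for all } \omega \in \mathbb{R}\}) = \aleph_0.$$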
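The reconstruction formulas in Examples F and G depend only on the fact that the sets S_1, ..., S_N partition the reals, so that exactly one indicator is nonzero at each ω. The sketch below checks the algebra on a measurable stand-in partition, since the saturated non-measurable sets of Theorem 1 cannot be computed. The partition by floor(ω) mod N and all function names are assumptions made purely for illustration.

```python
from fractions import Fraction
import math

N = 4  # number of sets in the partition (any integer > 1)

def cell(w):
    """Index i in 1..N of the stand-in set S_i that contains w."""
    return math.floor(w) % N + 1

def y_f(i, w):
    """Example F data: Y_{p_i}(w) = w * I_{S_i}(w)."""
    return w if cell(w) == i else Fraction(0)

def y_g(i, w):
    """Example G data: Y_{p_i}(w) = w * I_{S_i}(w) + 1."""
    return y_f(i, w) + 1

def recover_f(w):
    """Sum of the Example F observations."""
    return sum(y_f(i, w) for i in range(1, N + 1))

def recover_g_sum(w):
    """Sum of the Example G observations, minus N."""
    return sum(y_g(i, w) for i in range(1, N + 1)) - N

def recover_g_prod(w):
    """Product of the Example G observations, minus 1."""
    return math.prod(y_g(i, w) for i in range(1, N + 1)) - 1

samples = [Fraction(-7, 3), Fraction(0), Fraction(5, 2), Fraction(22, 7)]
print(all(recover_f(w) == w for w in samples))
print(all(recover_g_sum(w) == w == recover_g_prod(w) for w in samples))
```

Exact rationals are used so that the identities hold with equality rather than up to rounding.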


EXAMPLE H: Let $k$ be greater than one, and let
$(\Omega, \mathcal{F}, P)$ be a probability space on which can be defined
a random field $\{Z_p : p \in G\}$ of identically distributed,
mutually independent random variables each having a
probability density function given by

$$f(x) = \begin{cases} \dfrac{C}{x^2 (\log |x|)^2} & \text{if } |x| > 2 \\[1ex] 0 & \text{if } |x| \le 2 \end{cases}$$

where $C$ is the normalizing constant, and a zero mean unit
variance Gaussian random variable $X$ independent of
$\{Z_p : p \in G\}$. Define the random field $\{Y_p : p \in G\}$ via
$Y_p = Z_p + X$. Notice that the mean of $Z_p$ exists and is zero for
each $p \in G$. How might one attempt to estimate $X$ from
the data $\{Y_p : p \in G\}$? For $t \in G$, let
$|t| = \prod_{i=1}^{k} (t_i/\varepsilon)$, the number of points $p \in G$ with $p \le t$. With
an eye on Kolmogorov's strong law of large numbers, one
might be tempted to try to estimate $X$ via an estimate of
the form

$$X_n = \frac{1}{|n|} \sum_{\{p \in G :\, p \le n\}} Y_p.$$

Note that

$$X_n = X + \frac{1}{|n|} \sum_{\{p \in G :\, p \le n\}} Z_p,$$

and that the $Z_p$'s have zero
mean. Might we guess that $X_n$ should then converge to $X$,
in some meaningful sense? If so, we would be well advised
to guess again, since it follows from [3] (see also
[4, pp. 369-370] for the $k = 2$ case) that

$$\limsup_{n \in G} \left| \frac{1}{|n|} \sum_{\{p \in G :\, p \le n\}} Z_p \right| = \infty \text{ a.s.}$$
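Two integrals explain why the density in Example H is delicate. The worked steps below are a sketch using only the density f given in the example: the first absolute moment is finite, so the mean exists and is zero by symmetry, but the slightly stronger moment with an extra logarithmic factor is infinite. A condition of the form E[|Z| (log⁺ |Z|)^{k-1}] < ∞ is the usual requirement for a strong law over a k-parameter array of independent, identically distributed variables; that this is the form in which [3] states its result is an assumption here.

```latex
\begin{align*}
E|Z| &= 2C \int_2^\infty \frac{x}{x^2 (\log x)^2}\,dx
      = 2C \int_2^\infty \frac{dx}{x (\log x)^2}
      = 2C \Bigl[-\frac{1}{\log x}\Bigr]_2^\infty
      = \frac{2C}{\log 2} < \infty, \\
E\bigl[|Z| \log |Z|\bigr] &= 2C \int_2^\infty \frac{x \log x}{x^2 (\log x)^2}\,dx
      = 2C \int_2^\infty \frac{dx}{x \log x}
      = 2C \bigl[\log \log x\bigr]_2^\infty = \infty .
\end{align*}
```

The second integral already diverges for k = 2, and for larger k the integrand only grows, so the moment condition fails for every k > 1.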


CONCLUSION 

We have developed a set of examples pointing out
some caveats in the use of multiparameter martingales in
estimation theory. In particular, we noted some instances
in which estimators existed which yielded performance
superior to that of estimators based on conditional
expectation. We hope these results will be of use to those
concerned with such endeavors.
EXAMPLE I: Consider the situation depicted in
Example H. Recall that $G$ is a countable set, and let
$\{t_n : n \in \mathbb{N}\}$ be an enumeration of $G$. Now, recalling
Kolmogorov's strong law of large numbers, note that

$$\frac{1}{m} \sum_{i=1}^{m} Y_{t_i}$$

converges almost surely to $X$ as $m \to \infty$.


